Quantitative nuclease protection assay (qNPA) and sequencing (qNPS) improvements

ABSTRACT

The present disclosure provides an improvement to quantitative Nuclease Protection Assay (qNPA) and quantitative Nuclease Protection Sequencing (qNPS) methods. The disclosed methods use nuclease protection probes (NPPs) that include 5′-end and/or 3′-end flanking sequences, which provide a universal hybridization and/or amplification sequence. The disclosed methods can be used to sequence or detect target nucleic acid molecules, such as those present in fixed or insoluble samples.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation of U.S. application Ser. No. 14/115,032, filed Oct. 31, 2013, now U.S. Pat. No. ______, which is the U.S. National Stage of International Application No. PCT/US2012/035260, filed Apr. 26, 2012, which was published in English under PCT Article 21(2), which in turn claims the benefit of U.S. Provisional Application No. 61/482,486, filed May 4, 2011, U.S. Provisional Application No. 61/537,492, filed Sep. 21, 2011, and U.S. Provisional Application No. 61/576,143, filed Dec. 15, 2011, all herein incorporated by reference.

ACKNOWLEDGMENT OF GOVERNMENT SUPPORT

This invention was made with government support under Grant No. 5R43HG005949-02 awarded by The National Institutes of Health, National Human Genome Research Institute. The government has certain rights in the invention.

FIELD

The present disclosure provides improved quantitative nuclease protection assay (qNPA) and quantitative nuclease protection sequencing (qNPS) methods. Such methods can be used in the identification, detection and/or sequencing of nucleic acid targets.

BACKGROUND

Although methods of detecting and sequencing nucleic acid molecules are known, there is still a need for methods that permit analysis of multiple samples or multiple sequences simultaneously or contemporaneously. Methods of multiplexing nucleic acid molecule detection or sequencing reactions have not been realized at the most desired performance or simplicity levels.

SUMMARY

Methods are provided that greatly improve prior quantitative nuclease protection assay (qNPA) and quantitative nuclease protection sequencing (qNPS) methods and represent an improvement to current nucleic acid detection and sequencing methods. These methods can be used in the identification, detection and/or sequencing of nucleic acid molecule targets. The methods utilize a nuclease protection probe that includes one or more flanking sequences (NPPFs). The NPPFs include a sequence that is complementary to all or a portion of the target nucleic acid molecule, thus permitting specific binding or hybridization between the target nucleic acid molecule and the NPPF. For example, the region of the NPPF that is complementary to a region of the target nucleic acid molecule binds to or hybridizes to that region of the target nucleic acid molecule with high specificity (and in some examples can also bind to a region of a bifunctional linker). The NPPFs further include one or more flanking sequences at the 5′-end and/or 3′-end of the NPPF. Thus, the one or more flanking sequences are located 5′, 3′, or both, to the sequence complementary to the target nucleic acid molecule. If the NPPF includes a flanking sequence at both the 5′-end and 3′-end, in some examples the two flanking sequences differ from each other and are not complementary to each other. The flanking sequence(s) include several contiguous nucleotides having a sequence (such as a sequence of at least 12 nucleotides) not found in a nucleic acid molecule present in the sample, and provide a universal hybridization and/or amplification sequence. This universal hybridization and/or amplification sequence, when having a sequence complementary to at least a portion of an amplification primer, permits multiplexing, as the same amplification primers can be used to amplify NPPFs specific for different target nucleic acid molecules. It also provides a universal hybridization sequence for all NPPFs, which can be used to add a detectable label to the NPPF or to capture and concentrate NPPFs. For example, if the same flanking sequence is present on NPPFs specific for different target nucleic acid molecules, the same primer can be used to amplify any NPPF having the same flanking sequence, even if the NPPF targets a different nucleic acid molecule. For example, the flanking sequence can be used to capture NPPFs, such as onto a surface. The flanking sequence can contain a variable sequence, such as a sequence that is specific for each specific NPPF and can be used to either capture that NPPF on a surface or for other purposes, such as to identify the NPPF. Thus, in some examples, the disclosed methods are used to detect or sequence several different target nucleic acid molecules in a sample using a plurality of NPPFs, wherein each NPPF specifically binds to a particular target nucleic acid molecule. In one example, the disclosed methods are used to detect or sequence at least one target nucleic acid molecule in a plurality of samples simultaneously.

The disclosure provides methods for detecting or determining a sequence of at least one target nucleic acid molecule in a sample. The methods can include contacting the sample (such as one that has been heated to denature nucleic acid molecules in the sample) with at least one NPPF under conditions sufficient for the NPPF to specifically bind to the target nucleic acid molecule. The NPPF molecule includes a sequence complementary to all or a portion of the target nucleic acid molecule. This permits specific binding or hybridization between the NPPF and the target nucleic acid molecule. The method further includes contacting the sample with one or more nucleic acid molecules having a sequence that is complementary to all or a portion of a flanking sequence (such a molecule is referred to herein as a CFS) under conditions sufficient for the flanking sequence to specifically bind or hybridize to the CFS. More than one CFS can be used to hybridize to an entire flanking sequence (e.g., multiple individual CFSs can be hybridized to a single flanking sequence, such that the entire flanking sequence is covered). This results in the generation of NPPF molecules that have bound (hybridized) thereto the target nucleic acid molecule, as well as the CFS(s), thereby generating a double-stranded molecule, which can include at least four contiguous oligonucleotide sequences, with all bases engaged in hybridization to a complementary base.

After allowing the target nucleic acid molecule and the CFS(s) to bind to the NPPFs, the method can further include contacting the sample with a nuclease specific for single-stranded (ss) nucleic acid molecules (or ss regions of a nucleic acid molecule) under conditions sufficient to remove nucleic acid bases that are not hybridized to a complementary base. Thus for example, NPPFs that have not bound target nucleic acid molecule or CFSs, as well as unbound target nucleic acid molecules, other ss nucleic acid molecules in the sample, and unbound CFSs, will be degraded. This generates a digested sample that includes intact NPPFs present as double-stranded adducts with CFS(s) and target nucleic acid molecule. In some examples, the method further includes increasing the pH of the sample and/or heating it, to dissociate or remove target nucleic acid molecules and CFSs that are bound to the NPPFs.

The NPPFs that were bound to the target nucleic acid molecule and CFSs, and thus survived treatment with the nuclease, can be amplified and/or labeled. NPPFs in the digested sample can be amplified using one or more amplification primers, thereby generating NPPF amplicons. At least one amplification primer includes a region that is complementary to all or a portion of the flanking sequence of the NPPF. In some examples, the NPPF includes a flanking sequence at both the 5′-end and 3′-end, and two amplification primers are used, wherein one amplification primer has a region that is complementary to the 5′-end flanking sequence and the other amplification primer has a region that is complementary to the 3′-end flanking sequence.

Alternatively, instead of using the NPPFs that survived treatment with the nuclease, the target nucleic acid strand that was hybridized to the NPPF (such as a DNA strand) can be used directly, such as amplified, labeled, detected, sequenced, or combinations thereof. For example, the target nucleic acid strand can be amplified using one or more amplification primers, thereby generating target amplicons, which can be detected and/or sequenced. Thus, although NPPF amplicons are referred to herein, one will appreciate that target amplicons can be substituted therefor.

The resulting amplicons (or portion thereof, such as a 3′-portion) can then be sequenced or detected. In one example, amplicons are attached to a substrate. For example, the substrate can include at least one capture probe having a sequence complementary to all or a portion of a flanking sequence on the NPPF amplicon, thus permitting capture of the NPPF amplicons having the complementary flanking sequence. Alternatively, the substrate can include at least one anchor in association with a bifunctional linker, wherein the bifunctional linker includes a first portion which specifically binds to the anchor and a second portion which specifically binds to a portion of one of the NPPF amplicons. The captured NPPF amplicons can then be sequenced or detected, thereby determining the sequence of, or detecting, the at least one target nucleic acid molecule in the sample.

In other examples, the NPPF amplicons are detected or sequenced without capture onto an array. For example, the NPPF amplicons can be transferred to a sequencing platform.

The NPPF can be labeled with a detectable label, for example during amplification, or as a step without amplification. Alternatively, one or both flanking regions can be used to hybridize a detectable label to the NPPF.

The foregoing and other objects and features of the disclosure will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram showing an exemplary nuclease protection probe having flanking sequences (NPPF), 100. The NPPF 100 includes a region 102 having a sequence that specifically binds to the target nucleic acid sequence and to a bifunctional (or programming) linker. The NPPF also includes a 5′-flanking sequence 104 and a 3′-flanking sequence 106.

FIG. 2 is a schematic diagram showing the initial steps of a method of using the NPPFs 202 to detect or sequence a nucleic acid molecule using the disclosed methods. The dashed bars represent an NPPF specific for a first target (in some examples the NPPF is labeled with biotin (B)), the solid gray bars represent an NPPF specific for a second target (in some examples the NPPF is labeled with B), the dotted green bars represent nucleic acid molecules that are complementary to the flanking sequences (CFS) 204 of the NPPF, and the solid black bars represent the target nucleic acid 200 (e.g., DNA or RNA). The biotin can be added during amplification by using a primer that is biotin (or digoxin) labeled. Alternatively, the primer can be labeled with another label (such as a fluorophore), resulting in an NPPF that is labeled. (1) A sample (such as cells or FFPE tissue) is contacted with sample disruption buffer (for example to permit lysis of cells and tissues in the sample) and incubated with the NPPFs and CFSs. (2) Unbound (e.g., single-stranded) nucleic acid is digested with a nuclease specific for ss nucleic acid molecules (such as S1 nuclease). (3) The nuclease can be inactivated and the NPPFs dissociated from bound target molecules and bound CFSs, for example by addition of base and heating. (4) The remaining NPPFs are amplified, for example by using PCR with appropriate primers 208. In some examples, the primers 208 include a detectable label, to permit labeling of the resulting amplicons 210. The resulting amplicons 210 can be detected (FIG. 3) or sequenced (FIG. 4).

FIG. 3 is a schematic diagram showing how NPPF amplicons 210 can be (5) captured on an array 212 that includes bifunctional linkers 216 associated with anchors 214 or that includes nucleic acid capture molecules 220. The bifunctional linker 216 includes a region that is complementary to a region of the NPPF amplicons 210 (such as complementary to a sequence that had been hybridized to the target nucleic acid), and a region that is complementary to a portion of the anchor. The nucleic acid capture molecules 220 include a region that is complementary to a region of the NPPF amplicons 210 (such as to a flanking sequence or portion thereof). (6) In one example, avidin-horseradish peroxidase (HRP) is used to detect the bound NPPFs and (7) the array is imaged following addition of substrate. The location of the signal on the array allows identification of signal generated by a target nucleic acid molecule.

FIG. 4 is a schematic diagram showing that NPPF amplicons 210 can be (5) sequenced.

FIGS. 5A-B are schematic diagrams showing details of the nucleic acid molecules as they are processed during the steps of a method of using the NPPFs 402 to detect or sequence a nucleic acid molecule using the disclosed methods. The longer solid colored bars represent target nucleic acid molecules 400, the bars with lighter and darker colors on their ends are NPPFs 402 specific for a target, with the different colored ends 404 representing the flanking sequences. The color of the target is matched to the color of its corresponding NPPF. The shorter solid color bars represent nucleic acid molecules that are complementary to the flanking sequences (CFS) 406 of the NPPF.

FIGS. 6A-F are schematic drawings showing exemplary embodiments of NPPF molecules, including embodiments with (A and B) a flanking sequence only on one end of the NPPF or (C-F) with flanking sequences on both ends of the NPPF.

FIG. 7 is a bar graph showing the number of amplicons detected for each of seven unique NPPFs. Error bars represent one standard deviation from the mean.

FIG. 8 is a bar graph comparing the observed ratios for each of the 7 unique NPPFs to the ratios expected based on the amount of NPPF added to the original PCR reaction.

FIG. 9 is a bar graph and tables comparing the detected NPPF probes without (normal) or with amplification (extreme sensitivity, ES). Each experiment was performed in triplicate. The PCR amplified reactions were diluted before capture and measurement.

FIG. 10 is a table showing input material, NPPF types, and the experiment tags used to sequence cell line lysates. Each experiment was performed in duplicate or triplicate.

FIG. 11 is a bar graph showing the sequencing counts of forty-six NPPFs from a triplicate sequencing experiment using THP1 cell lysates. The error bars represent 1 standard deviation from the mean.

FIGS. 12A and 12B are line plots of sequencing counts of NPPFs from titration sequencing. This experiment looked at output linearity over an input range, as well as the range and limits of detection of the qNPS counting method. Four concentrations of THP1 cell lysate were used as input material. (A) shows the eight NPPFs with the lowest counts, (B) shows the four NPPFs with the highest counts. This experiment was performed in triplicate; this plot shows the result from only one replicate.

FIG. 13 is a line plot of sequencing counts of NPPFs to measure miRNAs. Three concentrations of HepG2 cell lysate were used as input material. Counts for five representative NPPFs are shown. This experiment was performed in triplicate; this plot shows the result from only one replicate.

FIG. 14 is a bar graph of sequencing counts from NPPFs that were amplified using a range of PCR cycle numbers and input amounts. Each bar represents one qNPA experiment using one of three different input concentrations and one of three PCR cycle numbers. All experiments were normalized so that the total number of reads was set equal, to facilitate comparing the results.

FIGS. 15A and 15B are bar graphs representing the NPPFs in triplicate reactions that were split and either (A) hybridized to an array or (B) sequenced and counted. Triplicate reactions were averaged and error bars represent one standard deviation from the mean.

SEQUENCE LISTING

The nucleic acid sequences listed herein are shown using standard letter abbreviations for nucleotide bases, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. In the provided sequences:

SEQ ID NOS: 1-16 provide exemplary anchor nucleic acid sequences that can be used with the disclosed methods.

SEQ ID NOS: 17 and 18 provide exemplary 5′- and 3′-flanking sequences, respectively, which can be used with an NPPF.

SEQ ID NOS: 19 and 20 provide exemplary PCR primers.

SEQ ID NOS: 21-44 provide exemplary primers containing barcode sequences present at nucleotides 25-30.

DETAILED DESCRIPTION

I. Overview

The present disclosure provides improved methods of detecting or sequencing a target nucleic acid molecule, which permit multiplexing. The disclosed methods provide several improvements over currently available sequencing and detection methods. For example, because the methods require less processing of the target nucleic acid molecules, bias introduced by such processing can be reduced or eliminated. For example, when the target is an RNA, current methods typically employ steps to isolate or extract the RNA from a sample, subject it to RT-PCR, ligate the RNA, or combinations thereof. In the disclosed methods, such steps are not required. As a result, the methods permit one to analyze a range of sample types not otherwise amenable to detection or sequencing. In addition, this results in less loss of the RNA from the sample, providing a more accurate result. It also reduces enzyme bias. The disclosed methods also provide for targeted detection and sequencing of a desired nucleic acid molecule. This greatly simplifies data analysis. Current whole genome sequencing methods are challenged by the large amount of data generated, and the need for complicated bioinformatics. Although costs of sequencing have decreased, the ability to determine sequences is outrunning the ability of researchers to store, transmit and analyze the data. As a result, there is commonly more data generated than can be analyzed in a reasonable amount of time. Because the disclosed methods are targeted, they can overcome these obstacles. For example, the amount of data generated is reduced, as only a portion of the target needs to be detected or sequenced. Long reads of nucleotides are not required, nor do fragments of sequences need to be properly aligned to a reference sequence. In addition, the results can be simply counted, without the need for complicated bioinformatics analysis.

For example, the method can be used to detect DNA or RNA, mutations such as gene fusions, insertions or deletions, tandem repeats, single nucleotide polymorphisms (SNPs), and DNA methylation. The method uses a probe, referred to herein as a nuclease protection probe comprising a flanking sequence (NPPF). The use of NPPF permits multiplexing, and conserves the stoichiometry of the detected or sequenced target nucleic acid molecule, because the flanking sequences on the probe permit universal primer binding sites for amplification and permit addition of sequencing adapters and experimental tags (at either the 3′- or the 5′-end, or at both ends for example to increase multiplexing), without destroying the stoichiometry. As the flanking sites can be universal, the same primers can be used to amplify any NPPF for any target sequence, thus allowing for multiplexing and conservation of stoichiometry. In one example, by amplifying from both ends of the NPPF, the disclosed methods provide greater specificity than prior qNPA and qNPS methods. Only NPPFs with intact 3′- and 5′-flanking sequences will be amplified exponentially, while NPPFs cleaved by the nuclease will not be amplified sufficiently to be sequenced or detected.

In addition, the primers permit addition of tags (such as experiment tags to permit the identification of the target without necessitating the sequencing of the entire NPPF itself or to permit samples from different patients to be combined into a single run, at either the 3′- or the 5′-end, or at both ends for example to increase multiplexing, as well as sequencing adapters to permit attachment of a sequence needed for a particular sequencing platform and formation of colonies for some sequencing platforms). The use of NPPFs also reduces the complexity of the sample that is analyzed (e.g., sequenced), as it reduces the sample containing for example whole genes to the NPPFs (or NPPF or target amplicons). The sequencing of NPPFs (or the target hybridized to the NPPF) simplifies data analysis compared to that required for other sequencing methods, reducing the algorithm to simply counting the matches to the NPPFs that were added to the sample, rather than having to match sequences to the genome and deconvolute the multiple sequences/gene that are obtained from standard methods of sequencing. In some examples, the disclosed methods increase the signal obtained as compared to prior qNPA and qNPS methods, such as an increase of at least 10-fold, at least 100-fold, at least 125-fold, at least 150-fold, or at least 200-fold without substantial dilution of the NPPF product before performing the amplification.

In one example, the disclosure provides methods for detecting at least one target nucleic acid molecule in a sample (such as at least 2, at least 3, at least 4, at least 5, at least 10, at least 20, at least 30, at least 40, at least 50, or at least 100 target nucleic acid molecules) or determining a sequence of at least one target nucleic acid molecule in a sample. In some examples, the sample is heated to denature nucleic acid molecules in the sample, for example to permit subsequent hybridization between the NPPF and the target nucleic acid molecules in the sample. In some examples, the sample is a lysed sample. In some examples, the sample is a fixed sample (such as a formalin-fixed, paraffin-embedded (FFPE) sample, hematoxylin and eosin stained tissues, or glutaraldehyde fixed tissues). For example, the target nucleic acid molecules can be fixed, cross-linked, or insoluble.

The methods can include contacting the sample with at least one nuclease protection probe comprising a flanking sequence (NPPF) under conditions sufficient for the NPPF to specifically bind to the target nucleic acid molecule. In some examples, the disclosed methods sequence or detect at least one target nucleic acid molecule in a plurality of samples simultaneously or contemporaneously. In some examples, the disclosed methods sequence or detect two or more target nucleic acid molecules in a sample (for example simultaneously or contemporaneously). In such an example, the sample is contacted with a plurality of NPPFs, wherein each NPPF specifically binds to a particular target nucleic acid molecule. For example, if there are 10 target nucleic acid molecules, the sample can be contacted with 10 different NPPFs each specific for one of the 10 targets. In some examples, at least 10 different NPPFs are incubated with the sample. However, it is appreciated that in some examples, more than one NPPF (such as 2, 3, 4, 5, 10, 20, or more) specific for a single target nucleic acid molecule can be used, such as a population of NPPFs that are specific for different regions of the target, or a population of NPPFs that can bind to the target and variations thereof (such as those having mutations or polymorphisms).

The NPPF molecule includes a 5′-end and a 3′-end, as well as a sequence in between that is complementary to all or a part of the target nucleic acid molecule. This permits specific binding or hybridization between the NPPF and the target nucleic acid molecule. For example, the region of the NPPF that is complementary to a region of the target nucleic acid molecule binds to or hybridizes to that region of the target nucleic acid molecule with high specificity. The NPPF can be complementary to all of, or a portion of, the target nucleic acid sequence. The NPPF molecule further includes one or more flanking sequences, which are at the 5′-end and/or 3′-end of the NPPF. Thus, the one or more flanking sequences are located 5′, 3′, or both, to the sequence complementary to the target nucleic acid molecule. Each flanking sequence includes several contiguous nucleotides, generating a sequence that is not found in a nucleic acid molecule present in the sample (such as a sequence of at least 12 contiguous nucleotides). If the NPPF includes a flanking sequence at both the 5′-end and 3′-end, in some examples the two flanking sequences differ from each other and are not complementary to each other.

The flanking sequence(s) provide a universal hybridization/amplification sequence, which is complementary to at least a portion of an amplification primer. In some examples, the flanking sequence can include (or permit addition of) an experimental tag, sequencing adapter, or combinations thereof. For example, the experimental tag can be a sequence complementary to a capture probe that permits capture of NPPFs, for example onto a surface (such as at a specific spot on the surface, or to a specific bead). In some examples, the experimental tag can be a sequence that identifies an NPPF, such as a tag specific for a particular patient or target sequence, for example to permit one to distinguish or group such tagged NPPFs. In some examples, the sequencing adapter is a sequence that permits an NPPF amplicon to be used with a particular sequencing platform.

The NPPF can be any nucleic acid molecule, such as a DNA or RNA molecule, and can include unnatural bases. In some examples the NPPF is at least 35 nucleotides, such as 40 to 80 or 50 to 150 nucleotides. The portion of the NPPF that is complementary to a region of the target nucleic acid molecule can be at least 6 nucleotides in length, such as at least 10, at least 25, or at least 60, such as 6 to 60 nucleotides in length. The flanking sequence(s) of the NPPF can be at least 6 nucleotides, at least 12 nucleotides, or at least 25 nucleotides, such as 12 to 50 nucleotides in length. In some examples, the NPPF includes two flanking sequences: one at the 5′-end and the other at the 3′-end. In some examples, the flanking sequence at the 5′-end differs from the flanking sequence at the 3′-end. In addition, if the NPPF includes two flanking sequences, ideally the two flanking sequences have a similar melting temperature (Tm), such as Tm values within +/−5° C. of each other.
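The length and Tm guidelines above can be checked programmatically. The following Python sketch is illustrative only and is not part of the disclosed methods: the `NPPF` class, the `check_nppf` function, and the example sequences are hypothetical, and the Wallace rule (2° C. per A/T, 4° C. per G/C) is used here as a crude stand-in for whatever Tm calculation a practitioner would actually choose.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def wallace_tm(seq: str) -> float:
    """Crude Tm estimate (Wallace rule); an assumption, not the source's method."""
    s = seq.upper()
    return 2.0 * (s.count("A") + s.count("T")) + 4.0 * (s.count("G") + s.count("C"))


@dataclass
class NPPF:
    flank5: str          # 5'-flanking sequence ("" if absent)
    target_region: str   # region complementary to the target
    flank3: str          # 3'-flanking sequence ("" if absent)

    @property
    def sequence(self) -> str:
        return self.flank5 + self.target_region + self.flank3


def check_nppf(p: NPPF, max_tm_diff: float = 5.0) -> list:
    """Return a list of guideline violations (an empty list means none found)."""
    problems = []
    if len(p.sequence) < 35:
        problems.append("NPPF shorter than 35 nucleotides")
    if len(p.target_region) < 6:
        problems.append("target-complementary region shorter than 6 nucleotides")
    for name, flank in (("5'", p.flank5), ("3'", p.flank3)):
        if flank and len(flank) < 6:
            problems.append(f"{name} flank shorter than 6 nucleotides")
    if p.flank5 and p.flank3:
        if p.flank5 == p.flank3:
            problems.append("5' and 3' flanks are identical")
        if p.flank5 == reverse_complement(p.flank3):
            problems.append("5' and 3' flanks are complementary to each other")
        if abs(wallace_tm(p.flank5) - wallace_tm(p.flank3)) > max_tm_diff:
            problems.append("flank Tm values differ by more than the allowed window")
    return problems


# Hypothetical probes for illustration only.
good = NPPF("ACGT" * 6 + "A", "GATTACAGATTACAGATTACAGATT", "TGCA" * 6 + "T")
bad = NPPF("ATATATATATAT", "GATTACAGATTACAGATTACAGATT", "GGGGGGGGGGGG")
print(check_nppf(good))
print(check_nppf(bad))
```

A check that the flanking sequences are absent from the sample's nucleic acids would require a reference search and is outside the scope of this sketch.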

The method further includes contacting the sample with a nucleic acid molecule having a sequence that is complementary to the flanking sequence (such a molecule is referred to herein as a CFS) under conditions sufficient for the flanking sequence to specifically bind or hybridize to the CFS. One skilled in the art will appreciate that instead of using a single CFS to protect a flanking sequence, multiple CFSs can be used to protect a flanking sequence. This results in the generation of NPPF molecules that have bound thereto the target nucleic acid molecule, as well as the CFS, thereby generating a double-stranded molecule that includes at least three contiguous oligonucleotide sequences, with all bases engaged in hybridization to a complementary base, which bases of the NPPF and CFSs can include unnatural bases. The CFS hybridizes to and thus protects its corresponding flanking sequence from digestion with the nuclease in subsequent steps. In some examples, each CFS is the exact length of its corresponding flanking sequence. In some examples, the CFS is completely complementary to its corresponding flanking sequence. However, one skilled in the art will appreciate that the 3′-end of a CFS that protects a 5′-end flanking sequence or the 5′-end of a CFS that protects the 3′-end flanking sequence can have a difference, such as one nucleotide at each of these positions.
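As an illustration of the relationship between a flanking sequence and its CFS, the following Python sketch derives a fully complementary CFS of the same length as the flank, and alternatively splits the flank into several shorter CFSs that together cover it. The function names and the example flank are hypothetical, and the sketch assumes unmodified DNA bases only.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_cfs(flank: str) -> str:
    """Single CFS: exact length of, and fully complementary to, the flank."""
    return reverse_complement(flank)


def split_cfs(flank: str, n_parts: int) -> list:
    """Several shorter CFSs that together cover the whole flank.

    The flank is cut into n_parts consecutive chunks (the last chunk may be
    shorter), and each chunk gets its own complementary oligonucleotide.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    size = -(-len(flank) // n_parts)  # ceiling division
    chunks = [flank[i:i + size] for i in range(0, len(flank), size)]
    return [reverse_complement(chunk) for chunk in chunks]


# Hypothetical flank for illustration only.
flank = "AACGTT"
print(design_cfs(flank))
print(split_cfs(flank, 2))
```

The one-nucleotide differences permitted at the CFS ends adjacent to the target region are not modeled here.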

After allowing the target nucleic acid molecule, as well as the CFS(s), to bind to the NPPFs, the method can further include contacting the sample with a nuclease specific for single-stranded (ss) nucleic acid molecules or ss regions of a nucleic acid molecule, such as S1 nuclease, under conditions sufficient to remove nucleic acid bases that are not hybridized to a complementary base. Thus for example, NPPFs that have not bound target nucleic acid molecule or CFSs, as well as unbound target nucleic acid molecules, other ss nucleic acid molecules in the sample, and unbound CFSs, will be degraded. This generates a digested sample that includes intact NPPFs present as double stranded adducts hybridized to CFSs and target nucleic acid. In some examples, for example if the NPPF is composed of DNA, the nuclease can include an exonuclease, an endonuclease, or a combination thereof.
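The protection logic of this step can be expressed as a simple coverage test. The Python sketch below is a deliberately simplified model, not a description of nuclease kinetics: it assumes an NPPF remains intact only if every one of its bases is paired, and it represents each hybridized partner (target or CFS) as a half-open interval of NPPF positions. All names and coordinates are hypothetical.

```python
def protected_positions(length, duplexes):
    """Mark which NPPF positions are base-paired.

    duplexes is a list of (start, end) half-open intervals, one per
    hybridized partner (target molecule or CFS).
    """
    covered = [False] * length
    for start, end in duplexes:
        for i in range(max(start, 0), min(end, length)):
            covered[i] = True
    return covered


def survives_nuclease(length, duplexes):
    """Simplified model: the NPPF stays intact only if every base is paired."""
    return all(protected_positions(length, duplexes))


# Hypothetical 75-nt NPPF: 5' flank 0-25, target region 25-50, 3' flank 50-75.
nppf_len = 75
with_target = [(0, 25), (25, 50), (50, 75)]   # both CFSs and target bound
without_target = [(0, 25), (50, 75)]          # CFSs bound, no target present

print(survives_nuclease(nppf_len, with_target))
print(survives_nuclease(nppf_len, without_target))
```

Under this model an NPPF whose target is absent loses its central region, so its two flanks are no longer joined and it cannot be amplified exponentially with a primer pair directed at both flanks.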

In some examples, the method further includes increasing the pH of the sample and/or heating it, for example to inactivate the nuclease, to remove target nucleic acid molecule and CFSs that are bound to the NPPFs, or combinations thereof. In some examples, the method includes releasing the target nucleic acid (such as a DNA) from the NPPF, and then further analyzing the released target (such as detecting or sequencing the target). In some examples the target nucleic acid is DNA, and the DNA is amplified prior to its detection or sequencing.

The NPPFs that were bound to the target nucleic acid molecule and CFSs and thus survived treatment with the nuclease can be amplified, for example using PCR amplification. NPPFs in the digested sample can be amplified using one or more amplification primers, thereby generating NPPF amplicons. At least one amplification primer includes a region that is complementary to an NPPF flanking sequence. In some examples, the NPPF includes a flanking sequence at both the 5′-end and 3′-end, and two amplification primers are used, wherein one amplification primer has a region that is complementary to the 5′-end flanking sequence and the other amplification primer has a region that is complementary to the 3′-end flanking sequence. One or both of the amplification primers can include a sequence that permits attachment of an experimental tag or sequencing adapter to the NPPF amplicon during the amplification, and one or both primers can be labeled to permit labeling of the NPPF amplicon. In some examples, both an experimental tag and a sequencing adapter are added, for example at opposite ends of the NPPF amplicon. For example, the use of such primers can generate an experimental tag or sequence tag extending from the 5′-end or 3′-end of the NPPF amplicon, or from both the 3′-end and 5′-end to increase the degree of multiplexing possible. The experimental tag can include a unique nucleic acid sequence that permits identification of a sample, subject, or target nucleic acid sequence. In some examples, the amplification primer contains an experimental tag that permits capture of the NPPF amplicon onto a substrate (for example by hybridization to a probe on the substrate having a sequence complementary to the capture sequence on the NPPF amplicon). The sequencing adapter can include a nucleic acid sequence that permits capture of the resulting NPPF onto a sequencing platform. For example, the amplification primer can include a sequence that permits attachment of a poly-A or poly-T sequence tag which can facilitate amplification once captured onto the sequencing chip. In some examples, the amplification primer is used to label the NPPF amplicon. In other examples, one or both flanking regions are used to hybridize a detectable label to the NPPF, such as with a labeled probe (for example without amplification).
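The use of an experimental tag to identify a sample can be sketched as a lookup on a fixed window of each read. The Python example below is illustrative only. It borrows the barcode position given for the exemplary primers of SEQ ID NOS: 21-44 (nucleotides 25-30) and assumes, purely for illustration, that the tag sits at the same position in each read; the tag sequences, sample names, and function names are hypothetical.

```python
from collections import defaultdict


def extract_tag(read, start=25, end=30):
    """Return the tag at 1-based, inclusive positions start..end of a read."""
    return read[start - 1:end]


def demultiplex(reads, tag_to_sample, start=25, end=30):
    """Group reads by sample according to the experimental tag they carry.

    Reads whose tag is not in tag_to_sample are collected under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        sample = tag_to_sample.get(extract_tag(read, start, end), "unassigned")
        bins[sample].append(read)
    return dict(bins)


# Hypothetical tags and reads for illustration only.
tags = {"ACGTAC": "patient_1", "TTGGCC": "patient_2"}
prefix = "N" * 24  # placeholder for the 24 nucleotides preceding the tag
reads = [
    prefix + "ACGTAC" + "GATTACA",
    prefix + "TTGGCC" + "GATTACA",
    prefix + "ACGTAC" + "CCCGGGA",
    prefix + "GGGGGG" + "GATTACA",
]
for sample, sample_reads in demultiplex(reads, tags).items():
    print(sample, len(sample_reads))
```

Exact matching is used here for simplicity; tolerance for sequencing errors in the tag would be a design choice beyond what the text specifies.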

The resulting NPPF (or target) amplicons (or portion thereof, such as a 3′-portion) can then be sequenced or detected, thereby determining the sequence of, or detecting, the at least one target nucleic acid molecule in the sample.

In one example, the NPPF amplicons (or portion thereof) are sequenced. Any method can be used to sequence the NPPF amplicons, and the disclosure is not limited to particular sequencing methods. In some examples, the sequencing method used is Solexa® sequencing, 454® sequencing, chain termination sequencing, dye termination sequencing, or pyrosequencing. In some examples, single molecule sequencing is used. In some examples where the NPPF amplicons are sequenced, the method also includes comparing the obtained NPPF sequence to a reference sequence database; and determining the number of each identified NPPF sequence.
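The comparison and counting step can be pictured with the following minimal sketch, which assumes exact matching of each read against a small in-memory reference table. The reference sequences, the NPPF names, and the function name are hypothetical; a practical implementation would likely tolerate sequencing errors and use an alignment tool rather than exact lookup.

```python
from collections import Counter

# Hypothetical reference "database": NPPF sequence -> NPPF identifier.
REFERENCE = {
    "GATTACAGATTACA": "NPPF_gene_A",
    "CCGGTTAACCGGTT": "NPPF_gene_B",
}


def count_nppf_reads(reads, reference=REFERENCE):
    """Count how many reads match each reference NPPF sequence exactly."""
    counts = Counter()
    for read in reads:
        name = reference.get(read)
        if name is not None:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    reads = ["GATTACAGATTACA", "CCGGTTAACCGGTT", "GATTACAGATTACA", "AAAAAAAAAAAAAA"]
    for name, n in sorted(count_nppf_reads(reads).items()):
        print(name, n)
```

The resulting per-NPPF counts serve as the quantitative readout for the corresponding target nucleic acid molecules.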

In some examples, the NPPF amplicons are detected. In such examples, the method can include contacting the NPPF amplicons with a surface, such as one having multiple spatially discrete regions. In one example, the NPPF amplicons are captured by one or more nucleic acid capture molecules on the surface, wherein the sequences of the nucleic acid capture molecules on the surface are complementary to at least a portion of a flanking sequence on the NPPF amplicon. This complementarity permits hybridization and binding of the NPPF amplicons to the capture molecules on the surface. Such capture molecules can be directly conjugated to the surface. The NPPF amplicons are incubated or contacted with the surface under conditions sufficient for the NPPF amplicons to specifically bind to the capture molecules on the surface. In some examples, the NPPF amplicons are contacted with a population of surfaces, wherein the population of surfaces includes subpopulations of surfaces (such as a population of beads), and wherein each subpopulation of surfaces comprises at least one nucleic acid capture molecule complementary to at least a portion of a flanking sequence on the NPPF amplicon. Thus, this permits capture of all NPPFs having a sequence complementary to the capture molecules on the surface, regardless of the sequence targeted by the NPPF. The bound NPPF amplicons can then be detected. In some examples, this step is used to purify or concentrate NPPF amplicons (for example from a mixture containing primers), and the NPPF amplicons can be subsequently released from the surface, for example by reversing hybridization (such as by increasing temperature to melt off the captured NPPFs or by changing pH and the temperature), and the NPPF amplicons analyzed.

In another example, the NPPF amplicons are captured onto a surface by using anchors and bifunctional linkers. The surface can include a plurality of regions, each region including at least one anchor in association with a bifunctional linker. The bifunctional linker includes a first portion which specifically binds to the anchor and a second portion which specifically binds to or hybridizes to at least a portion of one of the NPPF amplicons. The NPPF amplicons are incubated or contacted with the surface under conditions sufficient for the NPPF amplicons to specifically bind to the second portion of the bifunctional linker. In some examples, the NPPF amplicons are contacted with a population of surfaces, wherein the population of surfaces includes subpopulations of surfaces (such as a population of beads), and wherein each subpopulation of surfaces comprises at least one anchor in association with a bifunctional linker. The bound NPPF amplicons can then be detected.

In addition, the NPPF amplicon can include a detectable label, thereby permitting its detection. In some examples, such a label is introduced during amplification. In specific examples, the detectable label is a hapten, a fluorescent molecule, an enzyme, or a radioisotope. For example, biotin present on an NPPF amplicon can be detected by contacting the NPPF amplicons with avidin or streptavidin conjugated to horseradish peroxidase or alkaline phosphatase.

II. Terms

Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes VII, published by Oxford University Press, 2000 (ISBN 019879276X); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Publishers, 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by Wiley, John & Sons, Inc., 1995 (ISBN 0471186341); and George P. Redei, Encyclopedic Dictionary of Genetics, Genomics, and Proteomics, 2nd Edition, 2003 (ISBN: 0-471-26821-6).

The following explanations of terms and methods are provided to better describe the present disclosure and to guide those of ordinary skill in the art to practice the present disclosure. The singular forms “a,” “an,” and “the” refer to one or more than one, unless the context clearly dictates otherwise. For example, the term “comprising a cell” includes single or plural cells and is considered equivalent to the phrase “comprising at least one cell.” The term “or” refers to a single element of stated alternative elements or a combination of two or more elements, unless the context clearly indicates otherwise. As used herein, “comprises” means “includes.” Thus, “comprising A or B,” means “including A, B, or A and B,” without excluding additional elements.

To facilitate review of the various embodiments of this disclosure, the following explanations of specific terms are provided:

3′ end: The end of a nucleic acid molecule that does not have a nucleotide bound to it 3′ of the terminal residue.

5′ end: The end of a nucleic acid sequence where the 5′ position of the terminal residue is not bound by a nucleotide.

Amplifying a nucleic acid molecule: To increase the number of copies of a nucleic acid molecule, such as an NPPF or portion thereof. The resulting products are called amplification products or amplicons. An example of in vitro amplification is the polymerase chain reaction (PCR), in which a sample (such as a sample containing NPPFs) is contacted with a pair of oligonucleotide primers, under conditions that allow for hybridization of the primers to a nucleic acid molecule in the sample. The primers are extended under suitable conditions, dissociated from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of the nucleic acid molecule.

Binding or stable binding (of a nucleic acid): A first nucleic acid molecule (such as an NPPF) binds or stably binds to another nucleic acid molecule (such as a target nucleic acid molecule) if a sufficient amount of the first nucleic acid molecule forms base pairs or is hybridized to the other nucleic acid molecule, for example the binding of an NPPF to its complementary target nucleic acid sequence.

Binding can be detected by either physical or functional properties. Binding between nucleic acid molecules can be detected by any procedure known to one skilled in the art, including both functional (for example reduction in expression and/or activity) and physical binding assays.

Complementary: Ability to form base pairs between nucleic acids. Oligonucleotides and their analogs hybridize by hydrogen bonding, which includes Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between complementary bases. Generally, nucleic acid molecules consist of nitrogenous bases that are either pyrimidines (cytosine (C), uracil (U), and thymine (T)) or purines (adenine (A) and guanine (G)). These nitrogenous bases form hydrogen bonds between a pyrimidine and a purine, and the bonding of the pyrimidine to the purine is referred to as “base pairing.” More specifically, A will hydrogen bond to T or U, and G will bond to C. “Complementary” refers to the base pairing that occurs between two distinct nucleic acids or two distinct regions of the same nucleic acid.

“Specifically hybridizable” and “specifically complementary” are terms that indicate a sufficient degree of complementarity such that stable and specific binding occurs between the probe (for example, an NPPF) or its analog and the nucleic acid target (such as DNA or RNA target). The probe or analog need not be 100% complementary to its target sequence to be specifically hybridizable. A probe or analog is specifically hybridizable when there is a sufficient degree of complementarity to avoid non-specific binding of the probe or analog to non-target sequences under conditions where specific binding is desired, for example in the methods disclosed herein.
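The Watson-Crick pairing rules stated above (A with T or U, G with C) can be illustrated with the following minimal sketch, which reports the fraction of positions that base pair when two equal-length strands are aligned antiparallel. The function name and the example sequences are hypothetical, and the sketch ignores Hoogsteen pairing, gaps, and any thermodynamic consideration; it does not define a threshold for specific hybridization.

```python
# Watson-Crick pairs only: A pairs with T or U, and G pairs with C.
PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def fraction_paired(strand_a, strand_b):
    """Fraction of positions forming a base pair.

    Both strands are written 5'->3' and must have equal length;
    strand_b is read in reverse so the strands are antiparallel.
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((a, b) in PAIRS for a, b in zip(strand_a, reversed(strand_b)))
    return paired / len(strand_a)


if __name__ == "__main__":
    print(fraction_paired("GATTACA", "TGTAATC"))  # fully complementary
    print(fraction_paired("GATTACA", "TGTAATG"))  # one mismatched position
```

A value below 1.0 corresponds to a probe that is less than 100% complementary to its target, which, as noted above, can still be specifically hybridizable under suitable conditions.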

Conditions sufficient for: Any environment that permits the desired activity, for example, that permits specific binding or hybridization between two nucleic acid molecules (such as an NPPF and a target nucleic acid, an NPPF and a CFS, or between an NPPF and a bifunctional linker) or that permits a nuclease to remove (or digest) unbound nucleic acids.

Contact: Placement in direct physical association; includes both in solid and liquid form. For example, contacting can occur in vitro with a nucleic acid probe (e.g., an NPPF) and biological sample in solution.

Detect: To determine if an agent (such as a signal, particular nucleotide, amino acid, nucleic acid molecule, and/or organism) is present or absent. In some examples, this can further include quantification. For example, use of the disclosed methods permits detection of target nucleic acid molecules in a sample.

Detectable label: A compound or composition that is conjugated directly or indirectly to another molecule (such as a nucleic acid molecule, for example an NPPF or an amplification primer/probe) to facilitate detection of that molecule. Specific, non-limiting examples of labels include fluorescent and fluorogenic moieties, chromogenic moieties, haptens, affinity tags, and radioactive isotopes. The label can be directly detectable (e.g., optically detectable) or indirectly detectable (for example, via interaction with one or more additional molecules that are in turn detectable). Exemplary labels in the context of the probes disclosed herein are described below. Methods for labeling nucleic acids, and guidance in the choice of labels useful for various purposes, are discussed, e.g., in Sambrook and Russell, in Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Laboratory Press (2001) and Ausubel et al., in Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences (1987, and including updates).

Hybridization: The ability of complementary single-stranded DNA, RNA, or DNA/RNA hybrids to form a duplex molecule (also referred to as a hybridization complex). Nucleic acid hybridization techniques can be used to form hybridization complexes between a nucleic acid probe and the gene it is designed to target.

“Specifically hybridizable” and “specifically complementary” are terms that indicate a sufficient degree of complementarity such that stable and specific binding occurs between a first nucleic acid molecule (or its analog) and a second nucleic acid molecule (such as a nucleic acid target, for example, a DNA or RNA target). The first and second nucleic acid molecules need not be 100% complementary to be specifically hybridizable. Specific hybridization is also referred to herein as “specific binding.”

Hybridization conditions resulting in particular degrees of stringency will vary depending upon the nature of the hybridization method and the composition and length of the hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic strength (such as the Na⁺ concentration) of the hybridization buffer will determine the stringency of hybridization. Calculations regarding hybridization conditions for attaining particular degrees of stringency are discussed in Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring Harbor Laboratory, Plainview, N.Y. (chapters 9 and 11).

Nuclease: An enzyme that cleaves a phosphodiester bond. An endonuclease is an enzyme that cleaves an internal phosphodiester bond in a nucleotide chain (in contrast to exonucleases, which cleave a phosphodiester bond at the end of a nucleotide chain). Endonucleases include restriction endonucleases or other site-specific endonucleases (which cleave DNA at sequence specific sites), DNase I, Bal 31 nuclease, S1 nuclease, Mung bean nuclease, Ribonuclease A, Ribonuclease T1, RNase I, RNase PhyM, RNase U2, RNase CLB, micrococcal nuclease, and apurinic/apyrimidinic endonucleases. Exonucleases include exonuclease III and exonuclease VII. In particular examples, a nuclease is specific for single-stranded nucleic acids, such as S1 nuclease, Mung bean nuclease, Ribonuclease A, or Ribonuclease T1.

Nucleic acid: A deoxyribonucleotide or ribonucleotide polymer in either single or double stranded form, and unless otherwise limited, encompassing analogs of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides. The term “nucleotide” includes, but is not limited to, a monomer that includes a base (such as a pyrimidine, purine or synthetic analogs thereof) linked to a sugar (such as ribose, deoxyribose or synthetic analogs thereof), or a base linked to an amino acid, as in a peptide nucleic acid (PNA). A nucleotide is one monomer in a polynucleotide. A nucleotide sequence refers to the sequence of bases in a polynucleotide.

A target nucleic acid (such as a target DNA or RNA) is a nucleic acid molecule whose detection, amount, or sequence is intended to be determined (for example in a quantitative or qualitative manner). In one example, the target is a defined region or particular portion of a nucleic acid molecule, for example a DNA or RNA of interest. In an example where the target nucleic acid sequence is a target DNA or a target RNA, such a target can be defined by its specific sequence or function; by its gene or protein name; or by any other means that uniquely identifies it from among other nucleic acids.

In some examples, alterations of a target nucleic acid sequence (e.g., a DNA or RNA) are “associated with” a disease or condition. That is, detection of the target nucleic acid sequence can be used to infer the status of a sample with respect to the disease or condition. For example, the target nucleic acid sequence can exist in two (or more) distinguishable forms, such that a first form correlates with absence of a disease or condition and a second (or different) form correlates with the presence of the disease or condition. The two different forms can be qualitatively distinguishable, such as by nucleotide polymorphisms or mutation, and/or the two different forms can be quantitatively distinguishable, such as by the number of copies of the target nucleic acid sequence that are present in a sample.

Nucleotide: The fundamental unit of nucleic acid molecules. A nucleotide includes a nitrogen-containing base attached to a pentose monosaccharide with one, two, or three phosphate groups attached by ester linkages to the saccharide moiety.

The major nucleotides of DNA are deoxyadenosine 5′-triphosphate (dATP or A), deoxyguanosine 5′-triphosphate (dGTP or G), deoxycytidine 5′-triphosphate (dCTP or C) and deoxythymidine 5′-triphosphate (dTTP or T). The major nucleotides of RNA are adenosine 5′-triphosphate (ATP or A), guanosine 5′-triphosphate (GTP or G), cytidine 5′-triphosphate (CTP or C) and uridine 5′-triphosphate (UTP or U).

Nucleotides include those nucleotides containing modified bases, modified sugar moieties and modified phosphate backbones, for example as described in U.S. Pat. No. 5,866,336 to Nazarenko et al. (herein incorporated by reference). Nucleotides also include those containing other modifications, such as those found in locked nucleic acids (LNAs). Thus, the NPPFs, primers, CFSs, bifunctional linkers, and anchors disclosed herein can include natural and unnatural bases.

Examples of modified base moieties which can be used to modify nucleotides at any position on their structure include, but are not limited to: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl)uracil, and 2,6-diaminopurine.

Examples of modified sugar moieties which may be used to modify nucleotides at any position on their structure include, but are not limited to: arabinose, 2-fluoroarabinose, xylose, and hexose, or a modified component of the phosphate backbone, such as phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, or a formacetal or analog thereof.

Primer: A short nucleic acid molecule, such as a DNA oligonucleotide 9 nucleotides or more in length, which in some examples is used to initiate the synthesis of a longer nucleic acid sequence. Longer primers can be about 10, 12, 15, 20, 25, 30 or 50 nucleotides or more in length. Primers can be annealed to a complementary nucleic acid strand by nucleic acid hybridization to form a hybrid between the primer and the complement strand, and then the primer extended along the complement strand by a polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, for example by PCR or other nucleic-acid amplification methods.

In one example, a primer includes a label, which can be referred to as a probe.

Probe: A nucleic acid molecule capable of hybridizing with a target nucleic acid molecule (e.g., a target DNA or RNA) and, when hybridized to the target, capable of being detected either directly or indirectly. Thus probes permit the detection, and in some examples quantification, of a target nucleic acid molecule, such as a DNA or RNA. In some examples, a probe includes a detectable label.

Nuclease protection probe (NPP): A nucleic acid molecule having a sequence that is complementary to a target DNA or RNA and is capable of hybridizing to the target DNA or RNA. The NPP protects the complementary target DNA or RNA nucleic acid molecule from cleavage by a nuclease, such as a nuclease specific for single-stranded nucleic acids. A nuclease protection probe comprising a flanking sequence (NPPF) is an NPP that further includes one or more flanking sequences at the 5′-end, 3′-end, or both, wherein the flanking sequence includes a sequence of contiguous nucleotides not found in a nucleic acid molecule present in the sample, and which can provide a universal amplification sequence point that can be used as an attachment point for an amplification primer. In one example the flanking sequence is used to capture the NPPF to a substrate, wherein a nucleic acid capture sequence on the substrate and at least a portion of the flanking sequence are complementary to one another, thereby permitting capture of the NPPF onto the substrate.

Sample: A biological specimen containing DNA (for example, genomic DNA or cDNA), RNA (including mRNA or miRNA), protein, or combinations thereof, obtained from a subject (such as a human or other mammalian subject). Examples include, but are not limited to cells, cell lysates, chromosomal preparations, peripheral blood or fractions thereof, urine, saliva, tissue biopsy (such as a tumor biopsy or lymph node biopsy), surgical specimen, bone marrow, amniocentesis samples, fine needle aspirates, circulating tumor cells, and autopsy material. In one example, a sample includes RNA or DNA. In particular examples, samples are used directly (e.g., fresh or frozen), or can be manipulated prior to use, for example, by fixation (e.g., using formalin) and/or embedding in wax (such as FFPE tissue samples).

Sequence identity/similarity: The identity/similarity between two or more nucleic acid sequences, or two or more amino acid sequences, is expressed in terms of the identity or similarity between the sequences. Sequence identity can be measured in terms of percentage identity; the higher the percentage, the more identical the sequences are.

Methods of alignment of sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith & Waterman, Adv. Appl. Math. 2:482, 1981; Needleman & Wunsch, J. Mol. Biol. 48:443, 1970; Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988; Higgins & Sharp, Gene, 73:237-44, 1988; Higgins & Sharp, Comput. Appl. Biosci. 5:151-3, 1989; Corpet et al., Nucl. Acids Res. 16:10881-90, 1988; Huang et al. Comput. Appl. Biosci. 8, 155-65, 1992; and Pearson et al., Meth. Mol. Bio. 24:307-31, 1994. Altschul et al., J. Mol. Biol. 215:403-10, 1990, presents a detailed consideration of sequence alignment methods and homology calculations.

The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol. Biol. 215:403-10, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, National Library of Medicine, Building 38A, Room 8N805, Bethesda, Md. 20894) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn, and tblastx. Blastn is used to compare nucleic acid sequences, while blastp is used to compare amino acid sequences. Additional information can be found at the NCBI website.

Once aligned, the number of matches is determined by counting the number of positions where an identical nucleotide or amino acid residue is present in both sequences. The percent sequence identity is determined by dividing the number of matches either by the length of the sequence set forth in the identified sequence, or by an articulated length (such as 100 consecutive nucleotides or amino acid residues from a sequence set forth in an identified sequence), followed by multiplying the resulting value by 100.
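The calculation just described can be expressed as a minimal sketch. It assumes the two sequences have already been aligned to equal length, with “-” marking gaps; the function name, the gap convention, and the example sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_identified, aligned_other, articulated_length=None):
    """Percent sequence identity of two pre-aligned sequences.

    Matches are positions where both sequences carry the same residue
    (gap characters '-' never count as matches). The denominator is the
    ungapped length of the identified sequence unless an articulated
    length is supplied.
    """
    if len(aligned_identified) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1
        for x, y in zip(aligned_identified, aligned_other)
        if x == y and x != "-"
    )
    if articulated_length is None:
        denominator = len(aligned_identified.replace("-", ""))
    else:
        denominator = articulated_length
    return 100.0 * matches / denominator


if __name__ == "__main__":
    # 9 of 10 aligned positions are identical.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```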

One indication that two nucleic acid molecules are closely related is that the two molecules hybridize to each other under stringent conditions. Stringent conditions are sequence-dependent and are different under different environmental parameters. The nucleic acid probes disclosed herein are not limited to the exact sequences shown, as those skilled in the art will appreciate that changes can be made to a sequence, and not substantially affect the ability of a probe to function as desired. For example, sequences having at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%, such as 100% sequence identity to the disclosed probes are provided herein (e.g., SEQ ID NOS: 1-16). One of skill in the art will appreciate that these sequence identity ranges are provided for guidance only; it is possible that probes can be used that fall outside these ranges.

Sequencing: To determine the primary structure (or primary sequence) of an unbranched biopolymer. Sequencing results in a symbolic linear depiction known as a sequence which succinctly summarizes much of the atomic-level structure of the sequenced molecule, for example, a polynucleotide. When the molecule is a polynucleotide, such as, for example, RNA or DNA, sequencing can be used to obtain information about the molecule at the nucleotide level, which can then be used in deciphering various secondary information about the molecule itself and/or the polypeptide encoded thereby. DNA sequencing is the process of determining the nucleotide order of a given DNA molecule and RNA sequencing is the process of determining the nucleotide order of a given RNA molecule. In some examples, sequencing of a nucleic acid molecule is done indirectly, for example by determining the sequence of at least a portion of a nuclease protection probe comprising a flanking sequence (NPPF), which bound to the target nucleic acid molecule.

Simultaneous: Occurring at the same time or substantially the same time and/or occurring in the same sample or the same reaction (for example, contemporaneous). In some examples, the events occur within 1 microsecond to 120 seconds of one another (for example within 0.5 to 120 seconds, 1 to 60 seconds, or 1 to 30 seconds, or 1 to 10 seconds).

Subject: Any multi-cellular vertebrate organism, such as human and non-human mammals (e.g., veterinary subjects). In one example, a subject is known or suspected of having a tumor or an infection.

Surface (or substrate): Any solid support or material which is insoluble, or can be made insoluble by a subsequent reaction. Numerous and varied solid supports are known to those in the art and include, without limitation, nitrocellulose, the walls of wells of a reaction tray, multi-well plates, test tubes, polystyrene beads, magnetic beads, membranes, and microparticles (such as latex particles). Any suitable porous material with sufficient porosity to allow access by detector reagents and a suitable surface affinity to immobilize capture reagents (e.g., oligonucleotides) is contemplated by this term. For example, the porous structure of nitrocellulose has excellent absorption and adsorption qualities for a wide variety of reagents, for instance, capture reagents. Nylon possesses similar characteristics and is also suitable. Microporous structures are useful, as are materials with gel structure in the hydrated state.

Further examples of useful solid supports include natural polymeric carbohydrates and their synthetically modified, cross-linked or substituted derivatives, such as agar, agarose, cross-linked alginic acid, substituted and cross-linked guar gums, cellulose esters, especially with nitric acid and carboxylic acids, mixed cellulose esters, and cellulose ethers; natural polymers containing nitrogen, such as proteins and derivatives, including cross-linked or modified gelatins; natural hydrocarbon polymers, such as latex and rubber; synthetic polymers which may be prepared with suitably porous structures, such as vinyl polymers, including polyethylene, polypropylene, polystyrene, polyvinylchloride, polyvinylacetate and its partially hydrolyzed derivatives, polyacrylamides, polymethacrylates, copolymers and terpolymers of the above polycondensates, such as polyesters, polyamides, and other polymers, such as polyurethanes or polyepoxides; porous inorganic materials such as sulfates or carbonates of alkaline earth metals and magnesium, including barium sulfate, calcium sulfate, calcium carbonate, silicates of alkali and alkaline earth metals, aluminum and magnesium; and aluminum or silicon oxides or hydrates, such as clays, alumina, talc, kaolin, zeolite, silica gel, or glass (these materials may be used as filters with the above polymeric materials); and mixtures or copolymers of the above classes, such as graft copolymers obtained by initializing polymerization of synthetic polymers on a pre-existing natural polymer.

All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety for all purposes. All sequences associated with the GenBank Accession Nos. mentioned herein are incorporated by reference in their entirety as were present on Dec. 15, 2011, to the extent permissible by applicable rules and/or law. In case of conflict, the present specification, including explanations of terms, will control.

Although methods and materials similar or equivalent to those described herein can be used to practice or test the disclosed technology, suitable methods and materials are described below. The materials, methods, and examples are illustrative only and not intended to be limiting.

III. Methods of Detecting or Sequencing Nucleic Acid Molecules

Disclosed herein are methods of detecting and/or sequencing nucleic acid molecules present in a sample. In some examples, at least two different nucleic acid molecules are detected in the same sample or same assay (for example, in the same well of an assay plate or array). In some examples, the same nucleic acid molecule or molecules are detected in at least two different samples or assays (for example, in samples from different patients).

The disclosed methods provide improvements to a quantitative nuclease protection assay (qNPA), for example as described in International Patent Publications WO 99/032663; WO 00/037683; WO 00/037684; WO 00/079008; WO 03/002750; and WO 08/121,927; and U.S. Pat. Nos. 6,232,066; 6,238,869; 6,458,533; and 7,659,063, all of which are incorporated herein by reference in their entirety. See also, Martel et al., Assay and Drug Development Technologies. 2002, 1 (1-1):61-71; Martel et al., Progress in Biomedical Optics and Imaging, 2002, 3:35-43; Martel et al., Gene Cloning and Expression Technologies, Q. Lu and M. Weiner, Eds., Eaton Publishing, Natick (2002); Seligmann, PharmacoGenomics, 2003, 3:36-43; Martel et al., “Array Formats” in “Microarray Technologies and Applications,” U. R. Muller and D. Nicolau, Eds, Springer-Verlag, Heidelberg (2005); Sawada et al., Toxicology in Vitro, 20:1506-1513, 2006; Bakir, et al., Bioorg. & Med. Chem. Lett, 17:3473-3479, 2007; Kris et al., Plant Physiol. 144:1256-1266, 2007; Roberts et al., Laboratory Investigation, 87:979-997, 2007; Rimsza et al., Blood, 2008 October 15, 112 (8):3425-3433; Pechhold et al., Nature Biotechnology, 27:1038-1042, 2009. For example, the disclosed qNPA methods have enhanced sensitivity as compared to prior qNPA methods, such as an increase in detectable signal of at least 10-fold, at least 25-fold, at least 100-fold, at least 125-fold, at least 150-fold, at least 170-fold, or at least 200-fold. That is, at least 10-fold, or even as much as 200-fold less sample is required, or conversely, rare genes that were 10-times below the sensitivity, or even up to 20-times below the sensitivity of currently available methods are detectable with the disclosed methods. Consequently, sample types such as fine needle aspirates which provide very small amounts of FFPE, or circulating tumor cells, where as few as 10, 50, or 100 cells may be recovered from a patient, can be tested and rare genes detected using the disclosed methods.

In addition, the disclosed methods provide improvements to a quantitative nuclease protection sequencing (qNPS) method, for example as described in US Patent Publication No. US-2011-0104693. qNPS is a sequencing method that uses a qNPA to convert target nucleic acid molecules present in a sample, even when cross linked, into stable single-stranded nucleic acid targets (nuclease protection probes, NPPs) that can be recovered in solution without capture or separation, by use of the nuclease protection step and (as necessary) treatment with base to dissociate the nuclease protection probes from protecting target molecules, and in the case of RNA, hydrolyze the RNA target. The amounts of the NPPs remaining after nuclease hydrolysis are then determined by sequencing which can include sequencing of the probes themselves. The improved methods disclosed herein use a variation of a NPP, a nuclease protection probe comprising a flanking sequence (NPPF). The use of NPPF permits multiplexing, as well as conserving the stoichiometry of the detected or sequenced target nucleic acid molecule, because the flanking sequences on the probe permit universal primer binding sites for amplification. As the primer binding sites are universal, the same primers can be used to amplify any NPPF for any target sequence, thus allowing for multiplexing and conservation of stoichiometry. In one example, amplifying from flanking sequences on both ends of the NPPF provides an unexpected and greater specificity than prior qNPA and qNPS methods. NPPFs with intact 3′- and 5′-flanking sequences will be amplified exponentially; nuclease-cleaved NPPFs will not be amplified sufficiently to be sequenced or detected. In contrast, NPPs processed using prior qNPA methods can be partially cleaved at either the end of the sequence that is involved in capture onto the array or at the end of the sequence that is involved in detection on the array, or at both due to weak or incorrect hybridization to incorrect target nucleic acids, and yet still be captured and detected, leading to a loss of specificity for the correct target nucleic acid. This does not occur with the disclosed NPPF probes. The disclosed methods conserve the original nucleic acid molecule stoichiometry such that the detected or sequenced nucleic acid molecules retain the same relative quantities of the nucleic acid molecules as in the test sample, such as a variation of no more than 20%, no more than 15%, no more than 10%, no more than 9%, no more than 8%, no more than 7%, no more than 6%, no more than 5%, no more than 4%, no more than 3%, no more than 2%, no more than 1%, no more than 0.5%, or no more than 0.1%, such as 0.001%-5%, 0.01%-5%, 0.1%-5%, or 0.1%-1%.
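The difference between exponential amplification of intact NPPFs and the insufficient amplification of cleaved NPPFs can be illustrated with idealized arithmetic. The following sketch rests on textbook assumptions that are not stated in the disclosure: perfect doubling per cycle when both primer sites are present, and at most one additional copy per template per cycle when only one primer site remains. The function name and the cycle number are likewise hypothetical.

```python
def ideal_copies(initial, cycles, both_flanks_intact):
    """Idealized copy number after a given number of PCR cycles.

    Assumption (not from the disclosure): 100% efficiency. With both
    flanking sequences intact, copies double each cycle; with one flank
    lost, a single primer yields at most linear growth.
    """
    if both_flanks_intact:
        return initial * 2 ** cycles
    return initial * (1 + cycles)


if __name__ == "__main__":
    for intact in (True, False):
        print(intact, ideal_copies(1, 20, intact))
```

Under these assumptions, 20 cycles turn one intact NPPF into roughly a million copies while a cleaved NPPF yields at most 21, which is consistent with the statement that cleaved probes do not reach a detectable level.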

The disclosed methods also permit multiplexing experiments, such as multiple reactions within the same assay (such as multiple samples from different patients in the same reaction well), and multiple reactions analyzed within the same run/channel of the sequencer.

Specifically, in contrast to prior qNPA and qNPS methods, the disclosed methods use modified nuclease protection probes (NPPs), which include flanking sequences on one or both ends of the NPPs. These modified NPPs with 5′-end and/or 3′-end flanking sequences are referred to herein as nuclease protection probes with flanking sequences (NPPFs). The presence of one or both flanking sequences, which serve as universal priming points for hybridization and/or amplification (and can be used for other purposes including capture or tagging of NPPFs), conserves the original nucleic acid stoichiometry in the sample, as the flanking sequences are part of the NPPF. In addition, this eliminates the need for ligation to add priming sites, tags, and the like to the NPPFs. Ligation can introduce artifacts which skew the nucleic acid stoichiometry in the sample, and provides an additional source of variability. Eliminating the need for ligation therefore removes both a potential source of artifacts that skew stoichiometry and a potential cause of degraded reproducibility.

FIG. 1 is a schematic diagram showing an exemplary NPPF. The nuclease protection probe having at least one flanking sequence (NPPF) 100 includes a region 102 that includes a sequence that specifically binds to the target nucleic acid sequence (and can also specifically bind to a bifunctional linker). The target nucleic acid sequence can be DNA (e.g., genomic DNA or cDNA) or RNA (such as mRNA, miRNA, tRNA, siRNA), or both. The NPPF includes one or more flanking sequences 104 and 106. FIG. 1 shows an NPPF 100 with both a 5′-flanking sequence 104 and a 3′-flanking sequence 106. However, NPPFs can in some examples have only one flanking sequence.

FIG. 2 is a schematic diagram showing the initial steps of an exemplary method of using the NPPFs to detect or sequence a nucleic acid molecule. As shown in step 1, a sample (such as one known or suspected of containing a target nucleic acid, 200) that has been treated with a sample disruption buffer (e.g., lysed or otherwise treated to make nucleic acids accessible) is contacted or incubated with a plurality of nuclease protection probes having one or more flanking sequences (NPPFs) 202 including at least one NPPF which specifically binds to a first target nucleic acid (such as a target DNA or RNA). The reaction can also include other NPPFs which specifically bind to a second target nucleic acid, and so on. For example, the method can use one or more different NPPFs designed to be specific for each unique target nucleic acid molecule. Thus, the measurement of 100 genes requires the use of at least 100 different NPPFs, with at least one NPPF specific for each gene (such as several different NPPFs/gene). Thus, for example, the method can use at least 2 different NPPFs, at least 3, at least 4, at least 5, at least 10, at least 25, at least 50, at least 75, at least 100 or even at least 200 different NPPFs (such as 2 to 500, 2 to 100, 5 to 10, 2 to 10, or 2 to 20 different NPPFs). However, one will appreciate that in some examples, the plurality of NPPFs can include more than one (such as 2, 3, 4, 5, 10, 20, 50 or more) NPPFs specific for a single target nucleic acid molecule. The dashed bars in FIG. 2 represent an NPPF specific for a first target and the solid gray bars represent an NPPF specific for a second target. In some examples, the NPPFs include a detectable label, such as biotin (B), but one skilled in the art will appreciate that a label can be added at other steps, such as during amplification. Thus, the biotin shown in FIG. 2 is optional, and other labels can be used.
The reaction also includes nucleic acid molecules that are complementary to the flanking sequences (CFS), 204, that are specific for the flanking sequences of the NPPF 202. FIG. 2 shows the dotted green bars 204 as the CFSs specific for the flanking sequence(s) of the NPPF. One skilled in the art will appreciate that the sequence of the CFSs will vary depending on the flanking sequence present. In addition, more than one CFS can be used to ensure a flanking region is protected (e.g., at least two CFSs can be used that bind to different regions of a single flanking sequence). The CFS can include natural or unnatural bases. Although FIG. 2 shows NPPFs with flanking sequences on both ends of the NPPF, one skilled in the art will appreciate that a single flanking sequence can be used. The sample, NPPFs and CFSs are incubated under conditions sufficient for NPPFs to specifically bind to their respective target nucleic acid molecule, and for CFSs to bind to their complementary sequence on the NPPF flanking sequence. In some examples, the CFSs 204 are added in excess of the NPPFs 202, for example at least 5-fold more CFSs than NPPFs (molar excess), such as at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 20-fold, at least 40-fold, at least 50-fold, or at least 100-fold more CFSs than the NPPFs. In some examples, the NPPFs 202 are added in excess of the total nucleic acid molecules in the sample, for example at least 50-fold more NPPF than total nucleic acid molecules in the sample (molar excess), such as at least 75-fold, at least 100-fold, at least 200-fold, at least 500-fold, or at least 1000-fold more NPPF than the total nucleic acid molecules in the sample. For experimental convenience a similar concentration of each NPPF can be included to make a cocktail, such that for the most abundant nucleic acid target measured there will be at least 50-fold more NPPF for that nucleic acid target, such as an at least 100-fold excess.
The actual excess and total amount of all NPPFs used is limited only by the capacity of the nuclease (e.g., S1 nuclease) to destroy all NPPFs that are not hybridized to target nucleic acids. In some examples the reaction is heated, for example incubated overnight at 50° C.

As shown in step 2 in FIG. 2, after allowing the binding/hybridization reactions to occur, the sample is contacted with a nuclease specific for single-stranded (ss) nucleic acid molecules under conditions sufficient to remove (or digest) ss nucleic acid molecules, such as unbound nucleic acid molecules (such as unbound NPPFs, CFSs, and target nucleic acid molecules, or portions of such molecules that remain single stranded). As shown in FIG. 2, incubation of the sample with a nuclease specific for ss nucleic acid molecules results in degradation of any ss nucleic acid molecules, leaving intact double-stranded nucleic acid molecules, including NPPFs that have bound thereto CFSs and a target nucleic acid molecule. For example, the reaction can be incubated at 50° C. for 1.5 hours with S1 nuclease (though hydrolysis can occur at other temperatures and be carried out for other periods of time; the time and temperature required will depend in part on the amount of nuclease and on the amount of nucleic acid to be hydrolyzed, as well as on the Tm of the double-stranded region being protected).

After this reaction, the samples can optionally be treated to otherwise remove or separate non-hybridized material and/or to inactivate or remove residual enzymes (e.g., by heat, phenol extraction, precipitation, column filtration, etc.). For example, as shown in step 3 the pH of the reaction can be increased to inactivate the nuclease, and the reaction heated to destroy the nuclease. In addition, heating the reaction will also dissociate the target nucleic acid (such as target DNA or target RNA) and the CFSs from the complementary regions on the NPPF. This leaves behind the intact NPPFs that previously bound the target nucleic acid molecules and CFSs, wherein the intact NPPFs are in direct proportion to how much NPPF had been hybridized to the target. In some examples, the hybridized target nucleic acid and CFSs can be degraded, e.g., by nucleases or by chemical treatments. Alternatively, the sample can be treated so as to leave the (single strand) hybridized portion of the target nucleic acid molecules, or the duplex formed by the hybridized target nucleic acid molecules and CFSs to the NPPF, to be further analyzed (for example the target hybridized to the NPPF can be sequenced). In one example, the pH is increased to about pH 8, and the reaction is incubated at 95° C. for 10 minutes, thereby causing the target nucleic acid and the CFSs to dissociate (and if the target nucleic acid is RNA, hydrolyzing said target nucleic acids).

As shown in step 4 in FIG. 2, either after step 2 or step 3, the NPPFs are amplified, for example using PCR. FIG. 2 shows the PCR primers or probes 208 as arrows. The PCR primers or probes can include a label, such as biotin, thereby resulting in the production of amplicons that are labeled. At least a portion of the PCR primers/probes 208 are specific for the flanking sequences of the NPPFs 202. The resulting amplicons 210 can then be detected, for example by binding to an array (see FIG. 3) or sequenced (see FIG. 4). In some examples, the concentration of the primers 208 is in excess of that of the CFSs 204, for example in excess by at least 10,000-fold, at least 50,000-fold, at least 100,000-fold, at least 150,000-fold, at least 200,000-fold, or at least 400,000-fold. In some examples, the concentration of primers 208 in the reaction is at least 200 nM (such as at least 400 nM, at least 500 nM, or at least 1000 nM), and the concentration of CFSs 204 in the reaction is less than 1 pM, is less than 0.5 pM, or is less than 0.1 pM.
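
The primer and CFS concentrations recited above can be related to the recited fold-excess values by a simple unit conversion. The following is a minimal sketch for checking that arithmetic; the function name is hypothetical, and because the CFS concentrations are stated as upper limits ("less than"), each computed value is the minimum excess at that limit.

```python
PM_PER_NM = 1_000  # 1 nM = 1,000 pM


def primer_fold_excess(primer_nM: float, cfs_pM: float) -> float:
    """Molar fold excess of primer over CFS, after converting nM to pM."""
    return (primer_nM * PM_PER_NM) / cfs_pM


# 200 nM primer against CFS at the 1 pM and 0.5 pM limits
print(primer_fold_excess(200, 1))    # 200000.0
print(primer_fold_excess(200, 0.5))  # 400000.0
```

At 200 nM primer, a CFS concentration below 1 pM corresponds to more than a 200,000-fold excess, and one below 0.5 pM to more than a 400,000-fold excess, matching the upper fold-excess values listed.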

As shown in step 5 in FIG. 3, the amplicons 210, which are the amplified NPPFs, can be contacted with a surface 212 including multiple spatially discrete regions. Two different versions are shown. In one example (top), the surface includes at least one anchor 214 in association with a bifunctional linker 216. In some examples the amplicons 210 are added to a 2× buffer prior to contact with the surface 212. The bifunctional linker 216 includes a first portion which specifically binds to the anchor and a second portion which specifically binds to one of the plurality of NPPF amplicons 210. The amplicons 210 are incubated with the surface 212 under conditions sufficient for each of the plurality of NPPF amplicons 210 to specifically bind to the second portion of a bifunctional linker 216. As shown in FIG. 1, the region of the NPPF 102 that specifically binds to a bifunctional linker is complementary in sequence to the bifunctional linker (and is also complementary to the target nucleic acid sequence). The NPPF amplicons 210 bound to the second portion of the bifunctional linker 216 are detected utilizing the detectable label included in the NPPF amplicons 210, thereby detecting the target nucleic acid in the sample.

In another example (bottom), the surface includes at least one nucleic acid capture molecule 220, which can be directly attached to the surface through a covalent bond. In some examples the amplicons 210 are added to a 2× buffer prior to contact with the surface 212. The nucleic acid capture molecule 220 includes a sequence that is complementary to at least a portion of one of the plurality of NPPF amplicons 210, such as at least a portion of a flanking sequence region of the NPPF (or a region added to the flanking sequence during amplification, for example). The amplicons 210 are incubated with the surface 212 under conditions sufficient for each of the plurality of NPPF amplicons 210 to specifically bind to the nucleic acid capture molecule 220.

The NPPF amplicons 210 bound to the nucleic acid capture molecule 220 are detected utilizing the detectable label included in the NPPF amplicons 210, thereby detecting the target nucleic acid in the sample. For example, the NPPF amplicons can be incubated with the surface overnight at 50° C. to allow binding of the NPPF amplicons to the nucleic acid capture molecule 220. In one example, the NPPF amplicons are labeled with biotin. As shown in step 6 of FIG. 3, the biotin can be detected using avidin-HRP 218 (for example incubating with the avidin-HRP for 1 hour at 37° C.). As shown in step 7 of FIG. 3, excess unbound avidin-HRP 218 is removed, an appropriate substrate is added, and the surface imaged to detect the bound NPPFs. Although biotin is shown as an example, one skilled in the art will appreciate that other detection methods can be used, for example by detecting a fluorophore or antibody on the NPPF amplicons.

In some examples, if the NPPF amplicons 210 are not labeled (for example no label is added during amplification in step 4 of FIG. 2), the NPPF amplicons 210 can include a region (such as the flanking sequence or portion thereof) that is complementary to the sequence of a labeled probe (wherein this region is not complementary to the bifunctional linker 216). This complementary probe can then be hybridized to the NPPF amplicons 210 prior to attaching them to a substrate as shown in step 5 of FIG. 3.

In some examples, the NPPF amplicons are contacted with a plurality of surfaces (such as a population of beads or other particles). In one example, each surface (such as each bead or sub-population of beads within a mixed bead population) includes at least one anchor in association with a bifunctional linker including a first portion which specifically binds to the anchor and a second portion which specifically binds to one of the plurality of NPPF amplicons, under conditions sufficient for each of the plurality of NPPF amplicons to specifically bind to the second portion of a bifunctional linker. The NPPF amplicons bound to the second portion of the bifunctional linker can be detected utilizing the detectable label that is associated with the NPPF amplicons, thereby detecting the target nucleic acid molecule in the sample. In another example, each surface (such as each bead or sub-population of beads within a mixed bead population) includes at least one nucleic acid capture molecule having a sequence complementary to at least a portion of the NPPF amplicons (such as a flanking sequence or portion thereof), under conditions sufficient for each of the plurality of NPPF amplicons to specifically bind to the nucleic acid capture molecule. The NPPF amplicons bound to the nucleic acid capture molecule can be detected utilizing the detectable label that is associated with the NPPF amplicons, thereby detecting the target nucleic acid molecule in the sample.

As shown in step 5 in FIG. 4, the amplicons 210, which are the amplified NPPFs, can be sequenced. For example, one or both of the flanking sequences of the amplified NPPFs can include (or have added thereto) a sequence adapter, or a primer that is complementary to and hybridized to the flanking sequence can include a sequence adapter. The sequence adapter is complementary to capture sequences on the sequencing chip, and permits sequencing of the NPPF using a particular sequencing platform. In some examples, a plurality of NPPFs are sequenced in parallel, for example simultaneously or contemporaneously. This method can thus be used to sequence a plurality of NPPF sequences.

FIGS. 5A and 5B are schematic diagrams providing a further summary of the method, with more details of the nucleic acid molecules. As shown in the left panel of FIG. 5A, target nucleic acids 400 in a sample (such as a sample that has been treated with a sample disruption buffer) are contacted or incubated with a plurality of nuclease protection probes having one or more flanking sequences (NPPFs) 402 (wherein each NPPF is specific for a particular target nucleic acid 400), and with nucleic acid molecules that are complementary to the flanking sequences (CFS) 406, that are specific for the flanking sequences 404 on the ends of the NPPFs. Three different target nucleic acids 400 are shown: one copy of target 1 (green), two copies of target 2 (red), and three copies of target 3 (blue). In this example, equal amounts of each NPPF 402 are added. Although FIG. 5A shows NPPFs with flanking sequences on both ends of the NPPF, one skilled in the art will appreciate that a single flanking sequence can be used. The middle panel of FIG. 5A shows the reaction products after allowing the binding/hybridization reactions to occur between the target nucleic acids 400, NPPFs 402, and CFSs 406. The target nucleic acids 400 hybridize to a central region of the NPPFs, and the CFSs 406 hybridize to the 3′- and 5′-flanking sequences 404. The right panel of FIG. 5A shows the reaction products after the sample is contacted with a nuclease specific for single-stranded (ss) nucleic acid molecules under conditions sufficient to remove (or digest) ss nucleic acid molecules.
As shown, regions of the target nucleic acids that did not hybridize to an NPPF (e.g., 408) are digested away, as are ss regions of NPPFs that did not bind to a target nucleic acid or a CFS (e.g., 410). This leaves intact double-stranded nucleic acid molecules, including NPPFs that have bound thereto CFSs and a target nucleic acid molecule (e.g., 412), as well as regions of the NPPF that hybridized to target only (but not CFS), or that hybridized to CFS only (but not target) (e.g., 414).

The left panel of FIG. 5B shows the reaction products after separating the double-stranded nucleic acid molecules (for example using heat and increasing the pH). The resulting NPPFs that survive, which are in direct proportion to the target nucleic acid molecules that protected them during the nuclease step, can then be amplified. The middle panel of FIG. 5B shows the reaction products after they are amplified. The right panel of FIG. 5B shows that after amplification, the resulting NPPF amplicons can be detected or sequenced (e.g., see FIGS. 2-4).
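
The conservation of stoichiometry summarized in FIGS. 5A and 5B can be followed numerically using the 1:2:3 copy numbers shown for targets 1 to 3. The sketch below is a simplified illustration under stated assumptions, not the disclosed protocol: hybridization is assumed to go to completion, each NPPF is in excess of its target, nuclease digestion removes every unprotected NPPF, and universal amplification doubles every surviving NPPF equally per cycle. All function names and the values of 100 NPPF copies and 10 cycles are hypothetical.

```python
from fractions import Fraction


def surviving_nppfs(target_copies: dict, nppf_added: int) -> dict:
    """NPPFs left after nuclease treatment: one per protecting target,
    capped by the amount of NPPF added (NPPF is assumed to be in excess)."""
    return {name: min(copies, nppf_added) for name, copies in target_copies.items()}


def amplify(survivors: dict, cycles: int) -> dict:
    """Universal amplification: every NPPF doubles each cycle (idealized)."""
    return {name: n * 2 ** cycles for name, n in survivors.items()}


def relative(counts: dict) -> dict:
    """Express counts as exact fractions of the total."""
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}


targets = {"target 1": 1, "target 2": 2, "target 3": 3}  # as in FIG. 5A
amplicons = amplify(surviving_nppfs(targets, nppf_added=100), cycles=10)

print(relative(targets))
print(relative(amplicons))
```

Because the same primers amplify every NPPF, the relative amounts of the amplicons equal the relative amounts of the original targets in this idealized model.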

In some embodiments, the methods can include contacting a sample from a subject (such as a sample including nucleic acids, such as DNAs or RNAs) with a plurality of NPPFs including at least one NPPF which specifically binds to a first target (such as a first RNA) and optionally at least one NPPF which specifically binds to a second target (such as a second RNA). In some examples, the plurality of NPPFs includes more than one (such as 2, 3, 4, 5, or more) NPPFs specific for a single target nucleic acid molecule. For example, the plurality of NPPFs can include at least one NPPF (such as at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, 75, 100, 200, 300, 500, 1000, 2000, 3000, or more), wherein each NPPF specifically binds to a single target nucleic acid molecule. In another or additional example, the plurality of NPPFs includes at least two different NPPF populations (such as 2, 3, 4, 5, 10, 20, or 50 different NPPF sequences), wherein each NPPF population (or sequence) specifically binds to a different target nucleic acid molecule.

In some examples, several NPPFs hybridize to different portions of the same target nucleic acid, and the number of NPPFs hybridizing to different portions of each target nucleic acid can be the same or different. For example, a low expressed nucleic acid target may have more NPPFs that hybridize to it relative to a nucleic acid target expressed at a higher level, such as four NPPFs hybridizing to a low expressed nucleic acid target and a single NPPF hybridizing to a high expressed nucleic acid target. In some examples, some of the probes specific for some target nucleic acids may not have flanking sequences (e.g., NPPs), and thus may not be amplified, or labeled, or have the appropriate adapters attached, and thus this portion of the probes will not be detected or sequenced. Using such a mixture, which can be about 1 to 5, or about 1 to 10, or about 1 to 100, or about 1 to 1,000 NPPFs with flanking sequence to NPPs without flanking sequence, the signal measured, or the number of NPPFs sequenced, can be “attenuated”, such that if there are 10,000 copies of target nucleic acid, and a ratio of 1 to 5 is used, then after amplification only one-fifth (⅕) the number of NPPFs will be sequenced as would have been sequenced had every NPPF contained flanking sequences.
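
The attenuation described above reduces to a single multiplication. The sketch below is illustrative only and follows the text's own arithmetic, in which a "1 to 5" mixture results in one-fifth of the protected probes being detectable; that reading (one flanked probe in every five) is an assumption, as is the simplification that flanked and unflanked probes hybridize to the target equally well. The function name is hypothetical.

```python
from fractions import Fraction


def attenuated_count(target_copies: int, flanked_fraction: Fraction) -> Fraction:
    """Number of protected probes that carry flanking sequences and can
    therefore be amplified and sequenced (idealized, equal hybridization)."""
    return target_copies * flanked_fraction


# 10,000 target copies with one probe in five carrying flanking sequences
print(attenuated_count(10_000, Fraction(1, 5)))    # 2000
# the same target at the 1-in-100 mixture
print(attenuated_count(10_000, Fraction(1, 100)))  # 100
```

Attenuating an abundant target in this way leaves more sequencing reads, or more of the detector's dynamic range, available for low-abundance targets measured in the same reaction.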

In some examples, the plurality of NPPFs includes at least 2, at least 5, at least 10, at least 20, at least 100 or at least 1000 (such as 2 to 5000, 2 to 3000, 10 to 1000, 50 to 500, 25 to 300, 50 to 300, 10 to 100, or 50 to 100) unique NPPF sequences. The plurality of NPPFs can include any combination of NPPFs specific for one or more target nucleic acid molecules. The plurality of NPPFs, along with the CFSs, are incubated with the sample under conditions sufficient for the NPPFs to specifically hybridize to their respective target nucleic acids and their respective CFSs. In some examples, the CFSs are added in excess of the NPPFs, such as an at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold or at least 10-fold molar excess of CFS to NPPF. In some examples, the NPPFs are added in excess of the nucleic acid molecules in the sample, such as an at least 10-fold, at least 50-fold, at least 75-fold, at least 100-fold, at least 250-fold, at least 1,000-fold, at least 10,000-fold, or at least 100,000-fold molar excess or more of NPPF to nucleic acid molecules in the sample. It will be appreciated that if the NPPF for a highly abundant nucleic acid target is in 1,000-fold excess, and the concentration of each different NPPF is the same, then the excess of NPPF for a low abundant gene can be many times greater, such as 1,000 times greater for a gene that is 1,000-fold lower in abundance than the high abundant nucleic acid target.
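
The relationship between target abundance and NPPF excess at the end of the preceding paragraph can be made concrete with relative units. This is a minimal sketch; the unit values are arbitrary and hypothetical, chosen only so that the NPPF is in 1,000-fold excess over the abundant target and the rare target is 1,000-fold less abundant.

```python
from fractions import Fraction


def nppf_fold_excess(nppf_amount: Fraction, target_amount: Fraction) -> Fraction:
    """Molar fold excess of one NPPF over its own target."""
    return nppf_amount / target_amount


nppf_each = Fraction(1000)         # same amount of every NPPF in the cocktail
abundant = Fraction(1)             # abundant target (arbitrary unit)
rare = abundant / 1000             # target 1,000-fold lower in abundance

print(nppf_fold_excess(nppf_each, abundant))  # 1000
print(nppf_fold_excess(nppf_each, rare))      # 1000000
```

The excess for the rare target is 1,000 times the excess for the abundant target, that is, 1,000,000-fold in these units, which is why a cocktail with equal NPPF concentrations keeps every probe in excess once the most abundant target is covered.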

The hybridized sample can then be contacted with a nuclease specific for single-stranded nucleic acids (for example, S1 nuclease). The resulting NPPFs that survive, which are in direct proportion to the target nucleic acid molecules that protected them during the nuclease step, can then be amplified. For example, amplification primers that include a sequence complementary to the flanking sequence of the NPPF can be used. The resulting NPPF amplicons can then be detected by methods known in the art, for example by binding them to an array or other substrate, or sequenced. The target nucleic acid molecule(s) are identified as present in the sample when their respective NPPF is detected or sequenced.

A. Exemplary Hybridization Conditions

Disclosed herein are conditions sufficient for a plurality of NPPFs to specifically hybridize to target nucleic acid molecule(s), such as DNAs and RNAs present in a sample from a subject, as well as specifically hybridize to CFS complementary to the flanking sequence(s). For example, the features (such as length, base composition, and degree of complementarity) that will enable a nucleic acid (e.g., an NPPF) to hybridize to another nucleic acid (e.g., a target DNA or target RNA or CFS) under conditions of selected stringency, while minimizing non-specific hybridization to other substances or molecules, can be determined based on the present disclosure. Characteristics of the NPPFs are discussed in more detail in Section IV, below. Typically, a region of an NPPF will have a nucleic acid sequence (e.g., FIG. 1, 102) that is of sufficient complementarity to its corresponding target nucleic acid molecule to enable it to hybridize under selected stringent hybridization conditions, as well as a region (e.g., FIG. 1, 104, 106) that is of sufficient complementarity to its corresponding CFS to enable it to hybridize under selected stringent hybridization conditions. Exemplary hybridization conditions include hybridization at about 37° C. or higher (such as about 37° C., 42° C., 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., or higher). Among the hybridization reaction parameters which can be varied are salt concentration, buffer, pH, temperature, time of incubation, and amount and type of denaturant such as formamide. For example, nucleic acid (e.g., a plurality of NPPFs) can be added to a sample at a concentration ranging from about 10 pM to about 10 nM (such as about 30 pM to 5 nM, about 100 pM to about 1 nM), in a buffer (such as one containing NaCl, KCl, H₂PO₄, EDTA, 0.05% Triton X-100, or combinations thereof) such as a lysis buffer.

In one example, each NPPF is added to the sample at a final concentration of at least 10 pM, such as at least 20 pM, at least 30 pM, at least 50 pM, at least 100 pM, at least 150 pM, at least 200 pM, at least 500 pM, at least 1 nM, or at least 10 nM. In one example, each NPPF is added to the sample at a final concentration of about 30 pM. In another example, each NPPF is added to the sample at a final concentration of about 167 pM. In a further example, each NPPF is added to the sample at a final concentration of about 1 nM. In one example, each CFS is added to the sample at a final concentration of at least about 6 times the amount of probe, such as at least 10 times or at least 20 times the amount of probe (such as 6 to 20 times the amount of probe). In one example, each CFS is added at a concentration of at least 1 nM, at least 5 nM, at least 10 nM, at least 50 nM, at least 100 nM, or at least 200 nM, such as 1 to 100, 5 to 100 or 5 to 50 nM. For example, if there are six probes, each at 166 pM, each CFS can be added at 5 to 50 nM.
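
The six-probe example above can be checked with a short calculation. The text does not say whether "the amount of probe" refers to a single probe or to the combined probes, so the sketch below simply reports both figures and does not assert which reading is intended. The function names are hypothetical.

```python
PM_PER_NM = 1_000  # 1 nM = 1,000 pM


def total_probe_pM(n_probes: int, each_pM: float) -> float:
    """Combined concentration of all NPPFs in the cocktail, in pM."""
    return n_probes * each_pM


def cfs_fold_over_probe(cfs_nM: float, probe_pM: float) -> float:
    """Molar ratio of one CFS to a given probe concentration."""
    return cfs_nM * PM_PER_NM / probe_pM


total = total_probe_pM(6, 166)  # six probes at 166 pM each
for cfs_nM in (5, 50):
    per_probe = cfs_fold_over_probe(cfs_nM, 166)
    per_total = cfs_fold_over_probe(cfs_nM, total)
    print(cfs_nM, round(per_probe, 1), round(per_total, 1))
```

Six probes at 166 pM total 996 pM, or roughly 1 nM. A CFS at 5 to 50 nM is therefore about 30- to 300-fold over any single probe and about 5- to 50-fold over the combined probes.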

The nucleic acids in the sample are denatured, rendering them single stranded and available for hybridization (for example at about 95° C. to about 105° C. for about 5-15 minutes). By using different denaturation solutions, this denaturation temperature can be modified, so long as the combination of temperature and buffer composition leads to formation of single stranded target DNA or RNA or both. The nucleic acids in the sample and the CFSs are then hybridized to the plurality of NPPFs for between about 10 minutes and about 72 hours (for example, at least about 1 hour to 48 hours, about 6 hours to 24 hours, about 12 hours to 18 hours, or overnight) at a temperature ranging from about 4° C. to about 70° C. (for example, about 37° C. to about 65° C., about 42° C. to about 60° C., or about 50° C. to about 60° C.). Of course the hybridization conditions will vary depending on the particular NPPFs and CFSs used, but are set to ensure hybridization of NPPFs to the target molecules and the CFSs. In some examples, the plurality of NPPFs and CFSs are incubated with the sample at a temperature of at least about 37° C., at least about 40° C., at least about 45° C., at least about 50° C., at least about 55° C., at least about 60° C., at least about 65° C., or at least about 70° C. In one example, the plurality of NPPFs and CFSs are incubated with the sample at about 37° C., at about 42° C., or at about 50° C.

In some embodiments, the methods do not include nucleic acid purification (for example, nucleic acid purification is not performed prior to contacting the sample with the NPPFs and/or nucleic acid purification is not performed following contacting the sample with the NPPFs). In some examples, no pre-processing of the sample is required except for cell lysis. In some examples, cell lysis and contacting the sample with the plurality of NPPFs and CFSs occur sequentially. In other examples, cell lysis and contacting the sample with the plurality of NPPFs and CFSs occur concurrently, in some non-limiting examples without any intervening steps.

When the NPPFs are subsequently subjected to PCR (e.g., universal amplification or NPPF-specific amplification such as for real time PCR), the buffers and reagents used for lysis, hybridization of NPPFs to their target nucleic acids, nuclease digestion, and base hydrolysis can be compatible with the polymerase used for amplification.

B. Treatment with Nuclease

Following hybridization of the NPPFs to target nucleic acids in the sample and to CFSs, the sample is subjected to a nuclease protection procedure. NPPFs which have hybridized to a target nucleic acid molecule and (when used) CFS (one or two CFSs, depending on whether there are both 5′- and 3′-flanking sequences on the NPPF or just one, or no CFS where flanking sequences are not required for amplification or measurement) are not hydrolyzed by the nuclease and can be subsequently amplified, and then detected or sequenced (or both).

Treatment with one or more nucleases will destroy all ss nucleic acid molecules (including RNA and DNA in the sample that is not hybridized to (thus not protected by) NPPFs, NPPFs that are not hybridized to target nucleic acid, and (when used) CFSs not hybridized to an NPPF), but will not destroy ds nucleic acid molecules such as NPPFs which have hybridized to CFSs and a target nucleic acid molecule present in the sample. For example, if the sample includes a cellular extract or lysate, unwanted nucleic acids, such as non-target genomic DNA, tRNA, rRNA, mRNA, miRNA, and portions of the target nucleic acid molecule(s) that are not hybridized to complementary NPPF sequences (such as overhangs), which in the case of mRNA or DNA nucleic acid targets will constitute the majority of the nucleic acid target sequence, can be substantially destroyed in this step. This leaves behind a stoichiometric amount of target nucleic acid/CFS/NPPF duplex. If the target molecule is cross-linked to tissue as a result of fixation, the NPPFs hybridize to the cross-linked target molecule without the need to reverse cross-linking, or otherwise release the target nucleic acid from the tissue to which it is cross-linked.

Conditions can be selected such that single nucleotide differences leading to an unpaired base are not cleaved, or a nuclease can be used which just cleaves unpaired bases up to the ends of the hybridized nuclease protection probe, such as an exonuclease. Conditions can also be selected which will hydrolyze the NPPF sequence at the point of a single unpaired base, and similarly hydrolyze the target nucleic acid at that position.

Examples of nucleases include endonucleases, exonucleases, and combinations thereof. Any of a variety of nucleases can be used, including DNAase, pancreatic RNAse, mung bean nuclease, S1 nuclease, RNAse A, Ribonuclease T1, Exonuclease III, Exonuclease VII, RNAse CLB, RNAse PhyM, RNAse U2, or the like, depending on the nature of the hybridized complexes and of the remainder of nucleic acids and non-target nucleic acid sequences present in the sample. One of skill in the art can select an appropriate nuclease. In a particular example, the nuclease is specific for single-stranded (ss) nucleic acids, for example S1 nuclease. One advantage of using a nuclease specific for ss nucleic acids, in addition to hydrolyzing excess NPPFs and conferring the stoichiometry of target nucleic acid to the NPPFs, is to remove such single-stranded (“sticky”) molecules from subsequent reaction steps where they may lead to undesirable background or cross-reactivity. However, one skilled in the art will appreciate that if the target nucleic acid is to be sequenced, this is not necessary, as only the NPPFs with the appropriate sequencing adapters will hybridize to the sequencing chips, at which point the ss molecules from the sample can be washed away. S1 nuclease is commercially available from, for example, Promega, Madison, Wis. (cat. no. M5761); Life Technologies/Invitrogen, Carlsbad, Calif. (cat. no. 18001-016); Fermentas, Glen Burnie, Md. (cat. no. EN0321), and others. Reaction conditions for these enzymes are well-known in the art and can be optimized empirically.

In some examples, S1 nuclease diluted in a buffer (such as one containing sodium acetate, NaCl, KCl, ZnSO₄, KATHON, or combinations thereof) is added to the hybridized probe/sample mixture and incubated at about 37° C. to about 60° C. (such as about 50° C.) for 10-120 minutes (for example, 10-30 minutes, 30 to 60 minutes, 60-90 minutes, or 120 minutes) to digest non-hybridized nucleic acid from the sample and non-hybridized NPPFs.

The samples can optionally be treated to otherwise remove non-hybridized material and/or to inactivate or remove residual enzymes (e.g., by heating, phenol extraction, precipitation, column filtration, addition of proteinase K, addition of a nuclease inhibitor, chelating divalent cations required by the nuclease for activity, or combinations thereof). In some examples, the samples are optionally treated to dissociate the target nucleic acid and the CFS(s) from the complementary NPPF (e.g., using base hydrolysis and heat). In some examples, after hybridization and nuclease treatment, the target RNA molecule hybridized to the NPPF can be degraded, e.g., by dissociating the duplex with NPPF in base and then destroying the RNA by nucleases or by chemical/physical treatments, such as base hydrolysis at elevated temperature, leaving the NPPF in direct proportion to how much had been hybridized to target nucleic acid. Alternatively, the sample can be treated so as to leave the (single strand) hybridized portion of the target nucleic acid, or the duplex formed by the hybridized target nucleic acid and the probe, to be further analyzed.

In some examples, following incubation with a nuclease, base (such as NaOH or KOH) is added to increase the pH to about 9 to 12 and the sample is heated (for example to 95° C. for 10 minutes). This dissociates the target molecule/CFS/NPPFs dimers, leaving the NPPF in a single stranded state, and in the case of RNA, hydrolyzes the RNA target molecules. This step can also neutralize or deactivate the nuclease, such as by raising the pH above about 6.

In some examples, the sample is treated to adjust the pH to about 7 to about 8, for example by addition of acid (such as HCl). In some examples, the pH is raised to about 7 to about 8 in Tris buffer. Raising the pH can prevent the depurination of DNA and also prevents many ss-specific nucleases (e.g., S1) from functioning fully.

In some examples, the sample is purified or separated to remove undesired nucleic acid or other molecules prior to amplification, for example by gel purification or another separation method.

C. Amplification

The resulting NPPF molecules (or resulting target nucleic acid molecules that have been separated from the NPPF), which are in direct proportion to how much target nucleic acid was present in the sample tested, can be amplified, for example using routine methods such as PCR or other forms of enzymatic amplification or ligation-based methods of amplification.

Examples of in vitro amplification methods that can be used include, but are not limited to, quantitative real-time PCR; strand displacement amplification (see U.S. Pat. No. 5,744,311); transcription-free isothermal amplification (see U.S. Pat. No. 6,033,881); repair chain reaction amplification (see WO 90/01069); ligase chain reaction amplification (see EP-A-320 308); gap filling ligase chain reaction amplification (see U.S. Pat. No. 5,427,930); coupled ligase detection and PCR (see U.S. Pat. No. 6,027,889); and NASBA™ RNA transcription-free amplification (see U.S. Pat. No. 6,025,134). In one example, a ligation-based method of amplification is used, wherein the primers are NPPF specific and abut one another so that they can be ligated together, melted off, and then fresh primers ligated together for a series of cycles. Ligation can be enzymatic or non-enzymatic. If the NPPF flanking sequences are used for hybridization of the primers, the amplification can be universal.

Quantitative real-time PCR is another method of amplifying nucleic acid molecules in vitro, enabled by Applied Biosystems (TaqMan PCR). The 5′ nuclease assay provides a real-time method for detecting only specific amplification products. During amplification, annealing of the probe to its target sequence generates a substrate that is cleaved by the 5′ nuclease activity of Taq DNA polymerase when the enzyme extends from an upstream primer into the region of the probe. This dependence on polymerization ensures that cleavage of the probe occurs only if the target sequence is being amplified. The use of fluorogenic probes makes it possible to eliminate post-PCR processing for the analysis of probe degradation. The probe is an oligonucleotide with both a reporter fluorescent dye and a quencher dye attached. While the probe is intact, the proximity of the quencher greatly reduces the fluorescence emitted by the reporter dye by Förster resonance energy transfer (FRET) through space. For real time PCR, the sample of NPPFs can be divided into separate wells or reaction locations, and a different NPPF-specific set of primers is added to each well or reaction location. Using probes that each have a different label permits multiplexing of real time PCR to measure multiple different NPPFs within a single well or reaction location.

During amplification of the NPPF, an experiment tag and/or sequencing adapter can be incorporated as, for instance, part of the primer and extension constructs, for example at the 3′- or 5′-end or at both ends. For example, an amplification primer, which includes a first portion that is complementary to all or part of an NPPF flanking sequence, can include a second portion that is complementary to a desired experiment tag and/or sequencing adapter. One skilled in the art will appreciate that different combinations of experiment tags and/or sequencing adapters can be added to either end of the NPPF. In one example, the NPPF is amplified using a first amplification primer that includes a first portion complementary to all or a portion of the 3′-NPPF flanking sequence and a second portion complementary to (or comprising) a desired sequencing adapter, and a second amplification primer that includes a first portion complementary to all or a portion of the 5′-NPPF flanking sequence and a second portion complementary to (or comprising) a desired experiment tag. In another example, the NPPF is amplified using a first amplification primer that includes a first portion complementary to all or a portion of the 3′-NPPF flanking sequence and a second portion complementary to (or comprising) a desired sequencing adapter and a desired experiment tag, and a second amplification primer that includes a first portion complementary to all or a portion of the 5′-NPPF flanking sequence and a second portion complementary to (or comprising) a desired experiment tag.

It will be appreciated that NPPF-specific primers can be used to add sequencing adapters, experiment tags (including tags that permit capture of an NPPF by a substrate), and NPPF tags. The sample of NPPFs can be separated into separate wells or locations containing one or more different NPPF-specific primers, amplified, and then either sequenced separately or combined for sequencing (or detected).

Amplification can also be used to introduce a detectable label into the generated NPPF amplicons (for example if the NPPF was originally unlabeled or if additional labeling is desired), or another molecule that permits detection or quenching. For example, the amplification primer can include a detectable label, hapten, or quencher which is incorporated into the NPPF during amplification. Such a label, hapten, or quencher can be introduced at either end of the NPPF amplicon (or both ends), or anywhere in between.

In some examples, the resulting NPPF amplicons are cleaned up before detection or sequencing. For example, the amplification reaction mixture can be cleaned up before detection or sequencing using methods well known in the art (e.g., gel purification, biotin/avidin capture and release, capillary electrophoresis). In one example, the NPPF amplicons are biotinylated (or include another hapten) and captured onto an avidin or anti-hapten coated bead or surface, washed, and then released for detection or sequencing. Likewise, the NPPF amplicons can be captured onto a complementary oligonucleotide (such as one bound to a surface), washed, and then released for detection or sequencing. The capture of amplicons need not be particularly specific, as the disclosed methods eliminate most of the genome or transcriptome, leaving behind the NPPF that had been hybridized to the target nucleic acid molecule. Other methods can be used to clean up the amplified product, if desired.

The amplified products can also be cleaned up after the last step of amplification, while still double stranded, by a method that uses a nuclease that hydrolyzes single stranded oligonucleotides (such as Exonuclease I). The nuclease can in turn be inactivated before continuing to the next step, such as hybridization to a surface.

D. Detection of NPPF Amplicons

In some examples, the resulting amplicons are detected by any suitable means, for example based upon the detectable label present on the NPPF amplicons. In a specific, non-limiting example, the NPPF amplicons include a biotin label. In this example, the NPPF amplicons can be detected by incubating the amplicons (such as on a support, e.g., array or bead, containing the NPPF amplicons) with avidin-HRP, streptavidin-HRP, or a conjugate with another suitable enzyme such as alkaline phosphatase, and then contacting the support with a chromogenic-, chemiluminescence-, or fluorescence-generating substrate. In one non-limiting example, the substrate is TMA-3 (Lumigen, Southfield, Mich.). Additional chemiluminescent substrates are commercially available, such as LumiGlo® (KPL, Gaithersburg, Md.), SuperSignal® (Pierce, Rockford, Ill.), and ECL™ (Amersham/GE Healthcare, Piscataway, N.J.). Signal produced by the substrate is detected, for example utilizing a microarray imager (such as an OMIX, OMIX HD, CAPELLA, or SUPERCAPELLA imager, HTG Molecular Diagnostics, Tucson, Ariz.) or scanner, or visually, such as in a lateral flow device. Europium-based luminescence can be used, as well as electroluminescence, light scatter, or electrical detection (e.g., conductivity or resistance). In another example, the NPPFs include a fluorescent label, such as Cy-3 or Cy-5. The NPPF amplicons can be detected utilizing a standard microarray imager (such as a Typhoon™ imager (GE Life Sciences, Piscataway, N.J.), a GenePix® microarray scanner (Molecular Devices, Sunnyvale, Calif.), or a GeneChip® scanner (Affymetrix, Santa Clara, Calif.)), flow cytometry methods, or fluorescent microscopy methods. One of skill in the art can select suitable detection methods and reagents for these or other detectable labels.

E. Detection of NPPFs Utilizing Capture Molecules

In some embodiments, following hybridization, nuclease treatment, and amplification, the sample containing NPPF amplicons is contacted with a surface that includes multiple spatially discrete regions, each including a capture molecule, or is contacted with a plurality of surfaces, each including a capture molecule. For example, the surface can be a population of beads, wherein subpopulations of the beads each include at least one capture molecule. For example, a first subpopulation could include at least one capture molecule, while a second subpopulation could include at least one capture molecule having a different sequence than the first, and so on. In some examples, the capture molecule includes at least one anchor associated with a bifunctional linker (also referred to as a “programming linker”). Alternatively, the capture molecule includes a nucleic acid capture probe having a sequence that is complementary to at least a portion of an NPPF amplicon, such as complementary to all or a portion of a flanking region of an NPPF amplicon.

In an example where the capture molecule includes at least one anchor associated with a bifunctional linker, the anchor and the bifunctional linker are associated by hybridization, annealing, covalent linkage, or other binding. The bifunctional linker includes a first portion which specifically binds to (for example, is complementary to) the anchor and a second portion which specifically binds to (for example, is complementary to) one of the plurality of NPPF amplicons (such as complementary to all or a portion of region 102 of the NPPF 100 shown in FIG. 1).

In some embodiments, the disclosed methods include an anchor on a surface (for example on an array), which is associated with a bifunctional linker which is utilized to capture the NPPF amplicons following the amplification step. In some examples, an anchor is an oligonucleotide of about 8 to 150 nucleotides in length (for example, about 8 to 100, 15 to 100, 20 to 80, 25 to 75, or 25 to 50, such as about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, or 150 nucleotides). In one non-limiting example, the anchor is about 25 nucleotides in length. In some examples, the anchor includes a first portion that specifically binds to the first portion of the bifunctional linker and a second portion that acts as a spacer between the surface and the first portion of the anchor. In some examples, the second portion of the anchor is about 6 to 60 carbon atoms or nucleotides in length (such as about 6, 12, 24, 30, 36, 42, 48, 54, or 60 carbon atoms or nucleotides). In other examples, the second portion of the anchor is about 5 to 100 carbon atoms or nucleotides in length (such as about 10 to 50, 15 to 40, 20 to 30, or about 25 carbon atoms or nucleotides).

The base composition for anchors for the disclosed methods is such that the thermodynamic stability of the anchor and bifunctional linker pairing is high. In some examples, the percentage base composition for the anchors is about 30-40% G, 30-40% C, 10-20% A, and 10-20% T. In some examples, nearest neighbor frequency in the anchors minimizes G-G or C-C nearest neighbors to reduce side reactions mediated via G-quartet formation. In other examples, unnatural bases, or peptide nucleic acids, can be incorporated in the anchor or the bifunctional linker to modify its properties.
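The composition criteria above can be checked computationally for a candidate anchor. The following is a minimal Python sketch, not part of the disclosed methods; the function names base_percentages and gg_cc_neighbors are hypothetical, and the sequence used is SEQ ID NO: 1 from Table 1.

```python
from collections import Counter

def base_percentages(seq):
    """Return the percentage of G, C, A and T in an anchor sequence."""
    counts = Counter(seq.upper())
    return {base: 100.0 * counts[base] / len(seq) for base in "GCAT"}

def gg_cc_neighbors(seq):
    """Count adjacent G-G or C-C nearest-neighbor pairs in the sequence."""
    s = seq.upper()
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] in ("GG", "CC"))

# SEQ ID NO: 1 from Table 1, used here only as an illustration.
anchor = "TGATTCAGACCGGCCG"
composition = base_percentages(anchor)
neighbor_pairs = gg_cc_neighbors(anchor)
```

A candidate with a composition outside the stated ranges, or with many G-G or C-C neighbors, could be rejected before synthesis; the thresholds to apply are left to the practitioner.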

Methods of designing and synthesizing anchors of use in the disclosed methods are described, e.g., in PCT Publication No. WO 98/24098, incorporated herein by reference. In some examples, a set of anchors which are substantially dissimilar from one another is desirable. An exemplary algorithm for obtaining a set of dissimilar anchors is as follows:

1) The set size is defined. In some embodiments, 16, 24, 36, 48, 49, 64, 81, 96, and 100 constitute useful sizes.

2) The overall sequence structure of the anchor set is defined. The length and base composition as described above are used to define such parameters. In general, the number of G bases and C bases are held equal, as are the number of A bases and T bases. This equality optimizes the configurational diversity of the final sets. Thus, such sets will be described by the equation G_(n)C_(n)A_(m)T_(m).

3) For a set structure defined by m and n, a random number generator is employed to produce a set of random sequence isomers.

4) One member of the random sequence set is selected to be used as element #1 of the set.

5) The maximum similarity allowable among set members is defined. Similarity is defined in terms of local pair-wise base comparison. For example, when two oligomer strands of identical length n are aligned such that 5′ and 3′ ends are in register, the lack of mismatches refers to the situation where at all positions 1-n, bases in the two strands are identical. Complete mismatching refers to the situation wherein at all positions 1-n, bases in the two strands are different. For example, a useful maximum similarity might be 10 or more mismatches within a set of 16, 16mer capture probes.

6) A second member of the random sequence set is selected and its similarity to element #1 is determined. If element #2 possesses less than the maximum allowable similarity to element #1, it will be kept in the set. If element #2 possesses greater than the maximum allowable similarity, it is discarded and a new sequence is chosen for comparison. This process is repeated until a second element has been determined.

7) In a sequential manner, additional members of the random sequence set are chosen which satisfy the dissimilarity constraints with respect to all previously selected elements.
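Steps 1 through 7 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used to generate any particular anchor set: the function names, the fixed random seed, the retry limit, and the choice of n = 5 and m = 3 for a 16mer are all hypothetical. The maximum allowable similarity is expressed as a minimum number of mismatches between end-aligned strands.

```python
import random

def random_isomer(n_gc, m_at, rng):
    """Return a random sequence isomer containing n_gc G's, n_gc C's,
    m_at A's and m_at T's (the G_(n)C_(n)A_(m)T_(m) structure)."""
    bases = list("G" * n_gc + "C" * n_gc + "A" * m_at + "T" * m_at)
    rng.shuffle(bases)
    return "".join(bases)

def mismatches(a, b):
    """Count positions at which two equal-length, end-aligned strands differ."""
    return sum(x != y for x, y in zip(a, b))

def pick_anchor_set(set_size, n_gc, m_at, min_mismatches, seed=0,
                    max_tries=100_000):
    """Greedily build a set of mutually dissimilar anchors."""
    rng = random.Random(seed)
    chosen = [random_isomer(n_gc, m_at, rng)]  # element #1 (step 4)
    tries = 0
    while len(chosen) < set_size:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("set not filled; relax the similarity limit")
        candidate = random_isomer(n_gc, m_at, rng)
        # Keep the candidate only if it is dissimilar enough to every
        # previously selected element (steps 6 and 7); otherwise discard it.
        if all(mismatches(candidate, kept) >= min_mismatches
               for kept in chosen):
            chosen.append(candidate)
    return chosen

# Sixteen 16mers with at least 10 mismatches between any two members.
anchors = pick_anchor_set(set_size=16, n_gc=5, m_at=3, min_mismatches=10)
```

The sketch compares strands only in the end-aligned register described in step 5; a practical design tool might also screen shifted alignments and reverse complements, which is outside what the steps above specify.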

One non-limiting example of a set of 16 anchors which can be utilized in the disclosed methods is shown in Table 1.

TABLE 1 Exemplary anchor sequences

Anchor Sequence (5′ → 3′)    SEQ ID NO:
TGATTCAGACCGGCCG             1
CCCGGGGCGTCTTAAC             2
GGACGCCATATGCGCT             3
TGAGGGCTCCGCCATA             4
AACCCGTGACGTGTGC             5
AGCATCGCCGGTCCTG             6
CCTGCAAGGCTGACGT             7
CAGTTGTCGACCCCGG             8
CGGCGCGTCCAATTCG             9
ATCGATCTGAGGGCCC             10
GTACATGCGGCCTGCA             11
TAGCCGCTCGCTAGAG             12
CCTAGTGATGACCGGC             13
GTCTGAGGGCAACCTC             14
CTAGCTGGCTACGCAG             15
GCCATCCGCTTGGAGC             16

In other examples, the capture molecule includes at least one nucleic acid capture probe having a sequence that is complementary to at least a portion of an NPPF amplicon, such as complementary to all or a portion of a flanking region of an NPPF amplicon. For example, the nucleic acid capture probe can include a region that is complementary to the NPPF amplicon, and may include a region that is not (such as a region that permits attachment of the probe to a surface). The nucleic acid capture probe can be directly attached to a surface. For example, the nucleic acid capture probe can include an amine for covalent attachment to a surface. In some examples, a nucleic acid capture probe is an oligonucleotide of at least 8 nucleotides in length, such as at least 10, at least 15, at least 20, at least 30, at least 50, or at least 100 nucleotides in length (for example, about 8 to 100, 15 to 100, 20 to 80, 25 to 75, or 25 to 50, such as about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, or 150 nucleotides). One skilled in the art will appreciate that the region of the nucleic acid capture probe complementary to a region of the NPPF amplicon need not be 100% complementary, as long as hybridization can occur between the nucleic acid capture probe and appropriate NPPF amplicons. In some examples, the region of the nucleic acid capture probe complementary to a region of the NPPF amplicon is at least 8 nucleotides in length, such as at least 10, at least 15, at least 20, at least 30, at least 50, or at least 100 nucleotides in length (for example, about 8 to 100, 15 to 100, 20 to 80, 25 to 75, or 25 to 50, such as about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, or 150 nucleotides in length).

In some examples, the sample containing NPPF amplicons is denatured prior to contacting with the surface of the array (for example by heating to 95° C. for 5 minutes and rapidly chilling the sample on ice). In some examples, the sample containing NPPFs is adjusted prior to contacting with the surface (for example to adjust the concentration of salt or formamide). The sample containing NPPF amplicons is incubated with the surface (for example, an array or beads) for a sufficient period of time for the NPPF amplicons to specifically bind (for example, hybridize) to the capture molecule. In some examples, the sample is incubated with the surface at about 37° C. to about 65° C. (for example, about 45° C. to about 60° C., or about 50° C. to about 60° C., such as 50° C.) for at least 1 hour (for example 1 to 8 hours, 1 to 36 hours, 12 to 24 hours, or 16 to 24 hours, or overnight) to allow the NPPF amplicons to hybridize to the capture molecule (“NPPF capture”). The capture time can be shortened, for example if using microfluidic or macrofluidic devices, lateral flow devices, or by reducing diffusion and using active flow or mixing.

Some of the surfaces (or substrates) which can be used in the disclosed methods are readily available from commercial suppliers. In some embodiments, the surface is a 96-, 384-, or 1536-well microtiter plate, such as modified plates sold by Corning Costar. In other embodiments, a substrate includes one or more beads (such as a population of beads that can be differentiated by size or color, for example by flow cytometry). Alternatively, a surface comprising wells which, in turn, comprise indentations or “dimples” can be formed by micromachining a substance such as aluminum or steel to prepare a mold, then microinjecting plastic or a similar material into the mold to form a structure. Alternatively, a structure comprised of glass, plastic, ceramic, or the like, can be assembled. The separator can be, for example, a piece of material, e.g., silicone, with holes spaced throughout, so that each hole will form the walls of a test well when the three pieces are joined. The subdivider can be, for example, a thin piece of material, e.g., silicone, shaped in the form of a screen or fine meshwork. The divider on the surface separating different reactions can also be a coated surface to which solutions will not adhere, or a nanostructure, or simply be individual drops, or capillaries or microfluidic channels or locations. In some examples, the base is a flat piece of material (for example glass or plastic), in, for example, the shape of the lower portion of a typical microplate used for a biochemical assay. The top surface of the base can be flat, or can be formed with indentations that will align with the subdivider shape to provide full subdivisions, or wells, within each sample well. The three pieces can be joined by standard procedures, for example the procedures used in the assembly of silicon wafers.

Suitable materials for the surface include, but are not limited to: glass, silica, gold, silver, a gel or polymer, nitrocellulose, polypropylene, polyethylene, polybutylene, polyisobutylene, polybutadiene, polyisoprene, polyvinylpyrrolidine, polytetrafluoroethylene, polyvinylidene difluoride, polyfluoroethylene-propylene, polyethylenevinyl alcohol, polymethylpentene, polychlorotrifluoroethylene, polysulfones, hydroxylated biaxially oriented polypropylene, aminated biaxially oriented polypropylene, thiolated biaxially oriented polypropylene, ethyleneacrylic acid, ethylene methacrylic acid, and blends of copolymers thereof (see U.S. Pat. No. 5,985,567), or materials comprised of nanomaterials including carbon.

In general, suitable characteristics of the material that can be used to form the surface include: being amenable to surface activation such that upon activation, the surface of the support is capable of covalently attaching a biomolecule such as an oligonucleotide (e.g., anchor) thereto; amenability to “in situ” synthesis of biomolecules; and being chemically inert such that the areas on the support not occupied by oligonucleotides or proteins are not amenable to non-specific binding, or when non-specific binding occurs, such materials can be readily removed from the surface without removing the oligonucleotides or proteins. The surfaces can be permeable, partially permeable, or impermeable.

A wide variety of array formats for arrangement of the anchors can be employed in accordance with the present disclosure. One suitable format includes a two-dimensional pattern of discrete cells (such as 4096 squares in a 64 by 64 array). As is appreciated by those skilled in the art, other array formats including, but not limited to, slot (rectangular) and circular arrays are equally suitable for use (see U.S. Pat. No. 5,981,185). In some examples, the array is a multi-well plate.

Oligonucleotide anchors, bifunctional linkers, and other capture molecules (as well as NPPFs, CFSs, and PCR probes/primers) can be synthesized by conventional technology, for example, with a commercial oligonucleotide synthesizer and/or by ligating together subfragments that have been so synthesized. Nucleic acids which are too long to be reliably synthesized by such methods can be generated by amplification procedures, using conventional procedures.

In one embodiment, preformed nucleic acid anchors (e.g., oligonucleotide anchors) or nucleic acid capture probes having a sequence complementary to at least a portion of an NPPF amplicon (e.g., oligonucleotide probes), can be situated on or within the surface of a test region by any of a variety of conventional techniques, including photolithographic or silkscreen chemical attachment, disposition by ink jet technology, capillary, screen or fluid channel chip, electrochemical patterning using electrode arrays, contacting with a pin or quill, or denaturation followed by baking or UV-irradiating onto filters (see, e.g., Rava et al. (1996). U.S. Pat. No. 5,545,531; Fodor et al. (1996). U.S. Pat. No. 5,510,270; Zanzucchi et al. (1997). U.S. Pat. No. 5,643,738; Brennan (1995). U.S. Pat. No. 5,474,796; PCT WO 92/10092; PCT WO 90/15070). Oligonucleotide anchors or probes can be placed on top of the surface of a test region or can be, for example in the case of a polyacrylamide gel pad, embedded within the surface in such a manner that some of the anchor or probe protrudes from the gel structure into aqueous portions within the gel and gel surface and is available for interactions with a linker or NPPF. This is true for permeable surfaces and partially permeable surfaces, such as a surface where the first portion, such as the area of the surface in contact with the solutions containing bifunctional linkers or NPPFs, is permeable but a second portion, such as at some distance into the surface, is not permeable.

In one embodiment, preformed oligonucleotide anchors or probes are derivatized at the 5′ end with a free amino group; dissolved at a concentration routinely determined empirically (e.g., about 1 μM) in a buffer such as 50 mM phosphate buffer, pH 8.5 and 1 mM EDTA; and distributed with a Pixus nanojet dispenser (Cartesian Technologies) in droplets of about 10.4 nanoliters onto specific locations within a test well whose upper surface is that of a fresh, dry DNA Bind plate (Corning Costar). Depending on the relative rate of oligonucleotide attachment and evaporation, it may be required to control the humidity in the wells during preparation. In another embodiment, oligonucleotide anchors or probes can be synthesized directly on the surface of a test region, using conventional methods such as, for example, light-activated deprotection of growing oligonucleotide chains (for example, in conjunction with the use of a site directing “mask”) or by patterned dispensing of nanoliter droplets of deactivating compound using a nanojet dispenser. Deprotection of all growing oligonucleotides that are to receive a single nucleotide can be done, for example, and the nucleotide then added across the surface. In another embodiment, oligonucleotide anchors or probes are attached to the surface via the 3′ ends of the oligonucleotides, using conventional methodology.

F. Detection of NPPs Utilizing Alternative Methods

In some embodiments, following hybridization, nuclease treatment, and amplification, the NPPF amplicons are detected utilizing alternative methods, such as high-throughput platforms. In some examples, NPPF amplicons are detected utilizing gel electrophoresis, chromatography, mass spectrometry, sequencing, conventional microarray analysis, detection during the PCR amplification step, or hybrid capture. In some embodiments, the NPPF amplicons do not include a detectable label and indirect detection methods are utilized. Such methods are known to one of skill in the art and include, but are not limited to, those described herein.

In one example, NPPF amplicons are detected utilizing a bead-based assay, such as a bead array. One example of a bead-based assay utilizes X-MAP® beads (Luminex, Austin, Tex.), such as a QBEAD assay. In some examples, the NPPs are captured on X-MAP® beads or other beads by hybridization to an oligonucleotide associated with the beads (for example about 1 hour at about 50° C.). The detectable label included in the NPPF amplicons can be detected, for example by flow cytometry (such as utilizing a Luminex 200, Flexmap 3D, or other suitable instrument).

In another example, NPPF amplicons are detected utilizing a standard microarray. One example of such an array is a Nimblegen microarray (Nimblegen, Madison, Wis.). In some examples, the NPPF amplicons are hybridized to an array including oligonucleotides that specifically bind to the NPPF amplicons. The detectable label included in the NPPF amplicons can be detected.

In some examples, NPPF amplicons are detected with a “bar code” assay. One example of such an assay is the nCounter® Analysis System (Nanostring Technologies, Seattle, Wash.). In some examples, the NPPF amplicons are hybridized to a probe including one or more color coded tags (a “bar-code”). Detection of the color coded tags provides identification of the NPPF amplicon. See, e.g., WO 07/0761282; WO 07/076,129; WO 07/139,766.

G. Sequencing of Amplicons

In some examples, the resulting NPPF amplicons are sequenced, for example by sequencing the entire NPPF amplicon, or a portion thereof (such as an amount sufficient to permit identification of the target nucleic acid molecule). The disclosure is not limited to a particular sequencing method. In some examples, multiple different NPPF amplicons are sequenced in a single reaction. In one example, an experiment tag of the NPPF amplicon, which can be designed to correspond to a particular target sequence, can be sequenced. Thus, if the 3′ end of the NPPF amplicon has a sequence at the terminal 2 to 25 nucleotides (such as the terminal 2 to 5 or 2 to 7, for example the terminal 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides) which represents a unique sequence for each target measured, then this is all of the NPPF amplicon that needs to be sequenced to identify the target, and by counting the number of such experiment tags sequenced, the amount of each target in the sample can be determined.
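The counting of terminal tags described above can be illustrated with a short Python sketch. It is a simplified illustration only: the tag sequences, target names, reads, and the function name count_targets are hypothetical, each read is assumed to be written 5′ to 3′ with the tag at its 3′ end, and no allowance is made for sequencing errors.

```python
from collections import Counter

def count_targets(reads, tag_to_target, tag_len):
    """Tally reads per target using only the terminal tag_len nucleotides."""
    counts = Counter()
    for read in reads:
        tag = read[-tag_len:]             # terminal nucleotides at the 3' end
        target = tag_to_target.get(tag)   # None if the tag is not recognized
        if target is not None:
            counts[target] += 1
    return counts

# Hypothetical 4-nucleotide tags, each unique to one target.
tag_to_target = {"ACGT": "target_A", "TTAG": "target_B"}
reads = ["GGCCTTAAACGT", "GGCCTTAATTAG", "CCGGAATTACGT", "CCGGAATTGGGG"]
counts = count_targets(reads, tag_to_target, tag_len=4)
```

Here two reads end in the tag assigned to target_A and one in the tag assigned to target_B, while the read with an unrecognized tag is ignored, so the tallies reflect the relative amount of each target.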

In one example, the resulting NPPF amplicon, such as one composed of DNA, is sequenced using the chain termination method. This technique uses sequence-specific termination of a DNA synthesis reaction using modified nucleotide substrates. In chain terminator sequencing, extension is initiated at a specific site on the NPPF amplicon by using an oligonucleotide primer complementary to a portion of the NPPF amplicon. The oligonucleotide primer is extended using a polymerase, such as an RNA or DNA polymerase. Included with the primer and polymerase are the four deoxynucleotide bases (or ribonucleotides), along with a low concentration of a chain terminating nucleotide (commonly a di-deoxynucleotide). Limited incorporation of the chain terminating nucleotide by the polymerase results in a series of related nucleic acid fragments that are terminated only at positions where that particular nucleotide is used. The fragments are then size-separated, for example by electrophoresis in a slab polyacrylamide gel, or in a narrow glass tube (capillary) filled with a viscous polymer.

An alternative method is dye terminator sequencing. Using this approach permits complete sequencing in a single reaction, rather than the four needed with the chain termination method. This is accomplished by labeling each of the dideoxynucleotide chain-terminators with a separate fluorescent dye, which fluoresces at a different wavelength.

In another example, pyrosequencing is used, such as the methods commercialized by Biotage (for low throughput sequencing) and 454 Life Sciences (for high-throughput sequencing). In the array-based method (e.g., 454 Life Sciences), single-stranded nucleic acid (such as DNA) is annealed to beads and amplified via emulsion PCR (emPCR). These nucleic acid-bound beads are then placed into wells on a fiber-optic chip along with enzymes which produce light in the presence of ATP. When free nucleotides are washed over this chip, light is produced as ATP is generated when nucleotides join with their complementary base pairs. Addition of one (or more) nucleotide(s) results in a reaction that generates a light signal that is recorded, for example by a CCD camera. The signal strength is proportional to the number of nucleotides, for example, homopolymer stretches, incorporated in a single nucleotide flow.

In another example, the NPPF amplicons are sequenced using an Illumina® (e.g., HiSeq), Ion Torrent®, 454®, Helicos, PacBio®, or SOLiD® (Applied Biosystems) system, or any number of other commercial sequencing systems. Sequencing adapters (such as poly-A or poly-T tails present on the NPPF amplicons, for example introduced using PCR) are used for capture. Sequencing by 454® or Illumina® typically involves library preparation, accomplished by random fragmentation of DNA, followed by in vitro ligation of common adaptor sequences. For the disclosed methods, the step of random fragmentation of the nucleic acid to be sequenced can be eliminated, and the in vitro ligation of adaptor sequences can be to the NPPF amplicons, such as an experiment tag present in the NPPF amplicons, though these can also be incorporated by use of the flanking regions and PCR, avoiding the need for ligation. Once captured through sequencing adaptors to the sequencing chip/bead, bridge amplification is performed to form colonies of each probe for sequencing. In these methods, the NPPF amplicons end up spatially clustered, either to a single location on a planar substrate (Illumina®, in situ colonies, bridge PCR), or to the surface of micron-scale beads (454®, emulsion PCR), which can be recovered and arrayed (emulsion PCR). The sequencing method includes alternating cycles of enzyme-driven biochemistry and imaging-based data acquisition. These platforms rely on sequencing by synthesis, that is, serial extension of primed templates. Successive iterations of enzymatic interrogation and imaging are used to build up a contiguous sequencing read for each array feature. Data are acquired by imaging of the full array at each cycle (e.g., of fluorescently labeled nucleotides incorporated by a polymerase). More than one sequencing primer can be used on the colonies formed on the flow cell, permitting either dual-end sequencing, or sequencing of one or more other portions of the amplicon, such as a barcode or index tag, or experimental tag.

For 454®, a sequencing primer is hybridized to the NPPF amplicon after amplification on the sequencing chip/bead. Sequencing is performed by pyrosequencing. Amplicon-bearing beads are pre-incubated with Bacillus stearothermophilus (Bst) polymerase and single-stranded binding protein, and then deposited onto a microfabricated array of picoliter-scale wells, one bead per well, rendering this biochemistry compatible with array-based sequencing. Smaller beads are also added, bearing immobilized enzymes also required for pyrosequencing (ATP sulfurylase and luciferase). During the sequencing, one side of the semi-ordered array functions as a flow cell for introducing and removing sequencing reagents. The other side is bonded to a fiber-optic bundle for CCD-based signal detection. At each cycle, a single species of unlabeled nucleotide is introduced. For sequences where this introduction results in incorporation, pyrophosphate is released and, via ATP sulfurylase and luciferase, generates a burst of light detected by the CCD for specific array coordinates. Across multiple cycles, the pattern of detected incorporation events reveals the sequence of templates represented by individual beads.
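The relationship between per-flow signal and sequence described above can be illustrated with a minimal sketch. It assumes an idealized, already normalized flowgram in which one incorporated nucleotide gives a signal of about 1.0; the function name, flow order, and signal values are hypothetical and are not taken from any instrument software.

```python
def decode_flowgram(flow_order, signals):
    """Convert per-flow light signals into a base sequence.

    flow_order: nucleotides in the order they are flowed (assumed here: "TACG").
    signals: one normalized intensity per flow, about 1.0 per incorporated base.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Signal strength is proportional to the number of nucleotides
        # incorporated in the flow, so round to the nearest whole number.
        count = max(0, int(signal + 0.5))
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical signals for two cycles of the flow order T, A, C, G.
read = decode_flowgram("TACG", [1.04, 0.02, 1.97, 0.10, 0.05, 2.90, 0.96, 1.10])
print(read)  # TCCAAACG
```

A signal near 2 or 3 in a single flow is read as a homopolymer stretch of that length, which is why homopolymer length calls depend on how well the signal scales with the number of incorporations.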

For methods that use bridge PCR (e.g., Illumina®), amplified sequencing features are generated by bridge PCR. Both forward and reverse PCR primers are tethered to a solid substrate by a flexible linker, such that all amplicons arising from any single template molecule during the amplification remain immobilized and clustered to a single physical location on an array. In some examples, bridge PCR uses alternating cycles of extension with Bst polymerase and denaturation (e.g., with formamide). The resulting ‘clusters’ each consist of about 1,000 clonal amplicons. Several million clusters can be amplified to distinguishable locations within each of eight independent ‘lanes’ that are on a single flow-cell (such that eight independent experiments can be sequenced in parallel during the same instrument run). After cluster generation, the amplicons are linearized and a sequencing primer is hybridized to a universal adaptor sequence flanking the region of interest. Each cycle of sequence interrogation consists of single-base extension with a modified DNA polymerase and a mixture of four nucleotides. These nucleotides are ‘reversible terminators’, in that a chemically cleavable moiety at the 3′ hydroxyl position allows only a single-base incorporation to occur in each cycle, and one of four fluorescent labels, also chemically cleavable, corresponds to the identity of each nucleotide. After single-base extension and acquisition of images in four channels, chemical cleavage of both groups sets up for the next cycle. Read lengths of up to 36 bp are currently routinely obtained.

In one example, the Helicos® or PacBio® single molecule sequencing method is used.

It will be appreciated that the NPPF can be designed for sequencing by any method, on any sequencer developed currently or in the future. The NPPF itself does not limit the method of sequencing used, nor the enzyme used. Other methods of sequencing are or will be developed, and one skilled in the art can appreciate that the generated NPPF amplicons (or DNA hybridized to the NPPF) will be suitable for sequencing on these systems.

G. Controls

In some embodiments, the method includes the use of one or more positive and/or negative controls subject to the same reaction conditions as the actual experimental NPPFs. The use of tagging permits actual different samples to be used as controls but processed for sequencing and run in the same sequencing lane as test samples. DNA can be measured as a control for the number of cells when measuring target RNA.

In some examples, a “positive control” includes an internal normalization control for variables such as the number of cells lysed for each sample, the recovery of DNA or RNA, or the hybridization efficiency, such as one or more NPPFs, CFSs, corresponding linkers, and the like, which are specific for one or more basal level or constitutive housekeeping genes, such as structural genes (e.g., actin, tubulin, or others) or DNA binding proteins (e.g., transcription regulation factors, or others). In some examples, a positive control includes glyceraldehyde-3-phosphate dehydrogenase (GAPDH), peptidylprolyl isomerase A (PPIA), large ribosomal protein (RPLP0), ribosomal protein L19 (RPL19), or other housekeeping genes discussed below. Other positive controls can be spiked into the sample to control for the assay process, independent of sample.

In other examples, a positive control includes an NPPF specific for a DNA or RNA that is known to be present in the sample (for example a nucleic acid sequence likely to be present in the species being tested, such as a housekeeping gene). For example, the corresponding positive control NPPF can be added to the sample prior to or during hybridization with the plurality of test NPPFs. Alternatively, the positive control NPPF is added to the sample after nuclease treatment.

In some examples, a positive control includes a nucleic acid molecule known to be present in the sample (for example a nucleic acid sequence likely to be present in the species being tested, such as a housekeeping gene). The corresponding positive control nucleic acid molecule (such as in vitro transcribed nucleic acid or nucleic acid isolated from an unrelated sample) can be added to the sample prior to or during hybridization with the plurality of NPPFs.

In some examples, a “negative control” includes one or more NPPFs, CFSs, corresponding linkers, or the like, whose complement is known not to be present in the sample, for example as a control for hybridization specificity, such as a nucleic acid sequence from a species other than that being tested, e.g., a plant nucleic acid sequence when human nucleic acids are being analyzed (for example, Arabidopsis thaliana AP2-like ethylene-responsive transcription factor (ANT)), or a nucleic acid sequence not found in nature.

In some embodiments, the signal from each NPPF amplicon is normalized to the signal of at least one housekeeping nucleic acid molecule, for example to account for differences in cellularity between samples. Exemplary housekeeping genes include one or more of GAPDH (glyceraldehyde 3-phosphate dehydrogenase), SDHA (succinate dehydrogenase), HPRT1 (hypoxanthine phosphoribosyl transferase 1), HBS1L (HBS1-like protein), β-actin (ACTB), β-2 microglobulin (B2m), and AHSP (alpha hemoglobin stabilizing protein). One of skill in the art can select additional housekeeping genes for use in normalizing signals in the disclosed methods, including, but not limited to, ribosomal protein S13 (RPS13), ribosomal protein S20 (RPS20), ribosomal protein L27 (RPL27), ribosomal protein L37 (RPL37), ribosomal protein L38 (RPL38), ornithine decarboxylase antizyme 1 (OAZ1), polymerase (RNA) II (DNA directed) polypeptide A, 220 kDa (POLR2A), yes-associated protein 1 (YAP1), esterase D (ESD), proteasome (prosome, macropain) 26S subunit, ATPase, 1 (PSMC1), eukaryotic translation initiation factor 3, subunit A (EIF3A), or 18S rRNA (see, e.g., de Jonge et al., PLoS One 2:e898, 2007; Saviozzi et al., BMC Cancer 6:200, 2006; Kouadjo et al., BMC Genomics 8:127, 2007; each of which is incorporated herein by reference). The normalized values can be directly compared between samples or assays (for example, between two different samples in a single assay or between the same sample tested in two separate assays).
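The normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the disclosure requires normalization to the signal of at least one housekeeping nucleic acid molecule, whereas the choice of the geometric mean across several housekeepers, the function name, the probe name TARGET1, and all signal values are hypothetical.

```python
from statistics import geometric_mean


def normalize_to_housekeepers(signals, housekeepers):
    """Divide every probe signal by the geometric mean of the housekeeper signals.

    signals: mapping of probe name to raw signal (e.g., counts) for one sample.
    housekeepers: names of the probes used as the normalization reference.
    """
    reference = geometric_mean([signals[name] for name in housekeepers])
    return {name: value / reference for name, value in signals.items()}


# Two hypothetical samples; sample_b simply contains twice as many cells.
sample_a = {"GAPDH": 1000.0, "ACTB": 4000.0, "TARGET1": 500.0}
sample_b = {"GAPDH": 2000.0, "ACTB": 8000.0, "TARGET1": 1000.0}

norm_a = normalize_to_housekeepers(sample_a, ["GAPDH", "ACTB"])
norm_b = normalize_to_housekeepers(sample_b, ["GAPDH", "ACTB"])

# After normalization the target signal is about the same in both samples,
# because the difference in cellularity has been divided out.
print(round(norm_a["TARGET1"], 3), round(norm_b["TARGET1"], 3))
```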

IV. Nuclease Protection Probes with Flanking Sequences (NPPFs)

The disclosed methods permit detection and/or sequencing of one or more target nucleic acid molecules, for example simultaneously or contemporaneously. Based on the target nucleic acid, NPPFs can be designed for use in the disclosed methods using the criteria set forth herein in combination with the knowledge of one skilled in the art. In some examples, the disclosed methods include generation of one or more appropriate NPPFs for detection of particular target nucleic acid molecules. The NPPF, under a variety of conditions (known or empirically determined), specifically binds (or is capable of specifically binding) to a target nucleic acid or portion thereof, if such target is present in the sample.

The NPPFs include a region that is complementary to a target nucleic acid molecule, such that for each particular target nucleic acid sequence, there is at least one NPPF in the reaction that is specific for the target nucleic acid sequence. For example, if there are 2, 3, 4, 5, 6, 7, 8, 9 or 10 different target nucleic acid sequences to be detected or sequenced, the method will correspondingly use at least 2, 3, 4, 5, 6, 7, 8, 9 or 10 different NPPFs (wherein each NPPF corresponds to a particular target). Thus in some examples, the methods use at least two NPPFs, wherein each NPPF is specific for a different target nucleic acid molecule. However, one will appreciate that several different NPPFs can be generated to a particular target nucleic acid molecule, such as many different regions of a single target nucleic acid sequence. In one example, an NPPF includes a region that is complementary to a sequence found only in a single gene in the transcriptome. NPPFs are designed to be specific for a target nucleic acid molecule and to have similar T_(m)s (if to be used in the same reaction).

Thus, a single sample may be contacted with one or more NPPFs. A set of NPPFs is a collection of two or more NPPFs each specific for a different target and/or a different portion of a same target. A set of NPPFs can include at least, up to, or exactly 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 50, 100, 500, 1000, 2000, 3000, 5000 or 10,000 different NPPFs. In some examples, a sample is contacted with a sufficient amount of NPPF to be in excess of the target for such NPPF, such as a 100-fold, 500-fold, 1000-fold, 10,000-fold, 100,000-fold or 10⁶-fold excess. In some examples, if a set of NPPFs is used, each NPPF of the set can be provided in excess to its respective target (or portion of a target) in the sample. Excess NPPF can facilitate quantitation of the amount of NPPF that binds a particular target. Some method embodiments involve a plurality of samples (e.g., at least, up to, or exactly 10, 25, 50, 75, 100, 500, 1000, 2000, 3000, 5000 or 10,000 different samples) simultaneously or contemporaneously contacted with the same NPPF or set of NPPFs.

FIG. 1 shows an exemplary NPPF 100 having a region 102 that includes a sequence that specifically binds to or hybridizes to the target nucleic acid sequence, as well as flanking sequences 104, 106 at the 5′- and 3′-end of the NPPF, wherein the flanking sequences bind or hybridize to their complementary sequences (referred to herein as CFSs). The NPPFs (as well as CFSs that bind to the NPPFs) can be composed of natural (such as ribonucleotides (RNA), or deoxyribonucleotides (DNA)) or unnatural nucleotides (such as locked nucleic acids (LNAs, see, e.g., U.S. Pat. No. 6,794,499), peptide nucleic acids (PNAs)), and the like. The NPPFs can be single- or double-stranded. In some examples, the NPPFs include one or more synthetic bases or alternative bases (such as inosine). Modified nucleotides, unnatural nucleotides, synthetic, or alternative nucleotides can be used in NPPFs at one or more positions (such as 1, 2, 3, 4, 5, or more positions). In some examples, use of one or more modified or unnatural nucleotides in the NPPF can increase the T_(m) of the NPPF relative to the T_(m) of an NPPF of the same length and composition which does not include the modified nucleic acid. One of skill in the art can design probes including such modified nucleotides to obtain a probe with a desired T_(m). In one example, an NPPF is composed of DNA or RNA, such as single stranded (ssDNA) or branched DNA (bDNA). In one example, an NPPF is an aptamer.

Methods of empirically determining the appropriate size of an NPPF for use with particular targets or samples (such as fixed or crosslinked samples) are routine. In specific embodiments, an NPPF can be up to 500 nucleotides in length, such as up to 400, up to 250, up to 100, or up to 75 nucleotides in length, including, for example, in the range of 20-500, 20-250, 25-200, 25-100, 25-75, or 25-50 nucleotides in length. In one non-limiting example, an NPPF is at least 35 nucleotides in length, such as at least 40, at least 45, at least 50, at least 75, at least 100, at least 150, or at least 200 nucleotides in length, such as 50 to 200, 50 to 100 or 75 to 200, or 36, 72, or 100 nucleotides in length. Particular NPPF embodiments may be longer or shorter depending on desired functionality. In some examples, the NPPF is appropriately sized (e.g., sufficiently small) to penetrate fixed and/or crosslinked samples. Fixed or crosslinked samples may vary in the degree of fixation or crosslinking; thus, an ordinarily skilled artisan may determine an appropriate NPPF size for a particular sample condition or type, for example, by running a series of experiments using samples with known, fixed target concentration(s) and comparing NPPF size to target signal intensity. As NPPF length increases, in such an experiment, at some point target signal intensity should begin to decrease. In some examples, the sample (and, therefore, at least a proportion of target) is fixed or crosslinked, and the NPPF is sufficiently small that signal intensity remains high and does not substantially vary as a function of NPPF size.

The sequence 102 that specifically binds to the target nucleic acid sequence is complementary in sequence to the target nucleic acid sequence to be detected or sequenced. One skilled in the art will appreciate that the sequence 102 need not be complementary to an entire target nucleic acid (e.g., if the target is a gene of 100,000 nucleotides, the sequence 102 can be a portion of that, such as at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 100, or more consecutive nucleotides complementary to a particular target nucleic acid molecule). The specificity of a probe increases with length. Thus for example, a sequence 102 that specifically binds to the target nucleic acid sequence which includes 25 consecutive nucleotides will anneal to a target sequence with a higher specificity than a corresponding sequence of only 15 nucleotides. Thus, the NPPFs disclosed herein can have a sequence 102 that specifically binds to the target nucleic acid sequence which includes at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 100, or more consecutive nucleotides complementary to a particular target nucleic acid molecule (such as about 6 to 50, 10 to 40, 10 to 60, 15 to 30, 18 to 23, 19 to 22, or 20 to 25 consecutive nucleotides complementary to a target DNA or a target RNA). Particular lengths of sequence 102 that specifically binds to the target nucleic acid sequence that can be part of the NPPFs used to practice the methods of the present disclosure include 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 contiguous nucleotides complementary to a target nucleic acid molecule. In some examples where the target nucleic acid molecule is an miRNA (or siRNA), the length of the sequence 102 that specifically binds to the target nucleic acid sequence can be shorter, such as 20-30 nucleotides in length (such as 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides) to match the miRNA (or siRNA) length. However, one skilled in the art will appreciate that the sequence 102 that specifically binds to the target need not be 100% complementary to the target nucleic acid molecule. Depending on the reaction conditions and the corresponding selectivity of the nuclease used, more than one mismatch may be required (such as at least two adjacent mismatches) for nuclease digestion to occur. In some examples, the NPPF is degenerate at one or more positions (such as 1, 2, 3, 4, 5, or more positions), for example, a mixture of nucleotides (such as 2, 3, or 4 nucleotides) at a specified position in the sequence 102 that specifically binds to the target.
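As an illustration of a degenerate NPPF, the following sketch enumerates the concrete sequences represented by a target-binding region written with standard IUPAC ambiguity codes. The use of IUPAC notation, the function name, and the example sequence are assumptions made for illustration; the disclosure does not prescribe a notation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq):
    """List every concrete sequence encoded by a degenerate sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical region with a 2-nucleotide mixture (R) and a 4-nucleotide mixture (N).
variants = expand_degenerate("ACRTN")
print(len(variants))  # 8
```

The number of distinct probe sequences grows as the product of the mixture sizes at each degenerate position, which is one reason to limit degeneracy to a few positions.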

The sequence 102 also specifically binds to a programming or bifunctional linker (wherein a region of the bifunctional linker is complementary to sequence 102). In some embodiments, following hybridization and nuclease treatment, the sample is contacted with a surface (such as one that includes multiple spatially discrete regions), including at least one capture molecule, such as an anchor associated with a programming linker or a nucleic acid capture probe that includes a sequence complementary to a portion of the NPPF amplicon (such as a flanking sequence or portion thereof). As shown in FIG. 3, the bifunctional linker 216 includes a first portion which specifically binds to (for example, is complementary to) the anchor 214 and a second portion which specifically binds to (for example, is complementary to) a region of the NPPF amplicon 210. In some examples, the NPPF amplicon has at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 contiguous nucleotides complementary to the bifunctional linker.

The sequence of the flanking sequence 104, 106 can provide a universal amplification point that is complementary to at least a portion of an amplification primer. The flanking sequence thus permits multiplexing, as the same amplification primers can be used to amplify NPPFs specific for different target nucleic acid molecules. The flanking sequence is not similar to a sequence found in the target genome. For example, if the target nucleic acid is a human sequence, the sequence of the flanking sequence is not similar to a sequence found in the human genome. This helps to reduce binding of non-target sequences that may be present in the target genome to the NPPFs. Methods of analyzing a sequence for its similarity to a genome are well known in the art.

The flanking sequence 104, 106 can also be used to permit capture of an NPPF amplicon, for example capture to a substrate. For example, an NPPF containing a flanking sequence that includes a sequence complementary to a nucleic acid capture probe present on a surface (such as directly conjugated to a surface) can hybridize to the nucleic acid capture probe, permitting capture or binding of the NPPF amplicon to the surface. Thus, in some examples, the flanking sequence includes (or permits addition of, for example during amplification) an experimental tag, such as one that permits capture of the NPPF amplicon. One will appreciate that other experimental tags can be used, such as those used to uniquely identify an NPPF or populations of NPPFs, and that such experimental tags can be part of the NPPF, or can be added later, for example by using a primer complementary to the flanking sequence and which also includes a sequence complementary to the tag to be added to the resulting amplicon. The flanking sequences also permit labeling of the NPPF, for example during amplification of the NPPF, or by using a labeled probe that is complementary to the flanking sequence, and allowing the probe to bind to the NPPF. In some examples, the flanking sequence includes (or permits addition of, for example during amplification) a sequencing adapter, such as a poly-A or poly-T sequence needed for some sequencing platforms.

One will appreciate that an NPPF can include one or two flanking sequences (e.g., one at the 5′-end, one at the 3′-end, or both), and that the flanking sequences can be the same or different. As illustrated in FIGS. 6A and 6B, the NPPF can include a single flanking sequence. FIGS. 6A and 6B show the flanking sequence at the 5′-end, but one will appreciate it can also be at the 3′-end instead. FIG. 6A shows an example where all of the NPPFs in the reaction have the same flanking sequence F1. Amplification with an F1-specific primer (such as a labeled primer) could be used to add the same 5′- or 3′-tag (e.g., sequencing adaptor or experimental tag) to each NPPF. For example, the same sequencing adapter could be added to all of the NPPFs, permitting sequencing of the NPPFs in the same sequencing platform. FIG. 6B shows an example where each NPPF (or each subpopulation of NPPFs) in the reaction has a different flanking sequence, F1 to F3. For example, F1, F2, and F3 could be complementary to capture nucleic acid probes 1, 2, and 3, respectively, on a surface. This eliminates the need for bifunctional linkers (e.g., see bottom of FIG. 3). In another example, amplification with T1-F1-, T2-F2-, and T3-F3-specific primers can be used to add a different experimental tag to each different NPPF (or populations of NPPFs).

As illustrated in FIGS. 6C-6F, the NPPF can in some examples include two flanking sequences, one at the 5′-end and the other at the 3′-end of the NPPF. FIG. 6C shows an example where all of the NPPFs in the reaction have the same flanking sequence, F1, at both ends. FIG. 6D shows an example wherein all of the flanking sequences on the 5′-end are the same (e.g., F1), and all of the flanking sequences on the 3′-end are the same (e.g., F(a)), but the 5′-end and 3′-end flanking sequences differ. This permits, for example, inclusion of the same experimental tag on one end of the NPPFs and inclusion of the same sequencing adaptor on the other end of the NPPFs. As there will be no primer hybridization bias, each NPPF should be tagged with the same fidelity. FIG. 6E shows an example wherein all of the flanking sequences on one end are the same (e.g., F1 on the 5′-end), but all of the flanking sequences on the other end differ from one another (e.g., F(a), F(b), and F(c)). This permits the use of a single capture probe to capture all of the NPPFs (e.g., using a capture probe having at least a portion of its sequence complementary to F1). The flanking sequences on the other end, F(a), F(b), and F(c), could be used, for example, to differentially label each NPPF (such as using different experiment tags). Alternatively, F(a), F(b), and F(c) could be complementary to capture probes 1, 2, and 3, respectively, and F1 could be used to label all of the NPPFs in the same way. FIG. 6F shows an example wherein all of the flanking sequences are different, irrespective of their position (e.g., F(a), F(b), F(c), F1, F2, and F3). In this example, each flanking sequence can be used for a different experiment tag or for combinations of different experiment tags and different sequencing adapters.

Thus, an NPPF sequence can be represented by 1-2-3, where 1 and 3 are flanking sequences on either side of sequence 2 (which is complementary to the target nucleic acid). Each of these regions can hybridize at some point in the method to its complementary sequence. For example, A can be complementary to flanking sequence 1 of the NPPF (e.g., A can be a CFS complementary to sequence 1), B can be complementary to sequence 2 of the NPPF (e.g., a target sequence complementary to sequence 2), and C can be complementary to the flanking sequence 3 of the NPPF (e.g., C can be a CFS complementary to sequence 3). This is what occurs during the hybridization of the target nucleic acid molecules and CFSs to their corresponding NPPF. For example:

1-2-3

A-B-C

In some examples, the experimental tags (such as those that distinguish experiments or patients from one another) and sequencing adapters, represented by D and E respectively, are added using the flanking sequences, for example during amplification (such that the amplification primer is complementary to the flanking sequence and includes a sequence complementary to the tag or adapter to be added to the resulting NPPF amplicon). For example, amplification of the NPPF with such primers would result in a sequence as follows: E-1-2-3-D or D-1-2-3-E.
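The 1-2-3, A-B-C, and E-1-2-3-D notation can be made concrete with a short sketch. All sequences below are invented for illustration and do not correspond to any actual flanking sequence, adapter, or tag; the variable names are likewise hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical NPPF regions, written 5' to 3'.
flank_1 = "AGTTCAGACGTGTGCTCTTC"        # 5'-flanking sequence (region 1)
region_2 = "TTGACCGATGCAATCGGACTTAGCA"  # target-specific sequence (region 2)
flank_3 = "GATCGTCGGACTGTAGAACT"        # 3'-flanking sequence (region 3)
nppf = flank_1 + region_2 + flank_3     # 1-2-3

# A, B and C: the molecules that hybridize to regions 1, 2 and 3.
cfs_a = reverse_complement(flank_1)     # CFS for region 1
target_b = reverse_complement(region_2)  # target sequence
cfs_c = reverse_complement(flank_3)     # CFS for region 3

adapter_e = "AATGATACGGCGACC"  # stands in for a sequencing adapter (E)
tag_d = "ACGTAC"               # stands in for an experimental tag (D)

# Each primer has a flank-specific 3' portion and carries the addition at its 5' end.
forward_primer = adapter_e + flank_1
reverse_primer = reverse_complement(flank_3 + tag_d)

# Strand of the resulting amplicon that has the same sense as the NPPF: E-1-2-3-D.
amplicon = adapter_e + nppf + tag_d

print(amplicon.startswith(forward_primer))                    # True
print(amplicon.endswith(reverse_complement(reverse_primer)))  # True
```

Because only the flanking regions are used for priming, the same two primers would add E and D to any NPPF that shares flank_1 and flank_3, regardless of its target-specific region.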

The table below also shows five exemplary combinations of 5′-tags (such as experimental tags or sequencing adapters), 5′-flanking sequences, target-specific sequences, 3′-flanking sequences, and 3′-tags. The 5′-tags and 3′-tags are added during amplification. The 5′-flanking sequences and 3′-flanking sequences are sequences that are part of the original NPPF (and thus part of the flanking sequence itself).

Columns: 5′-Tag | 5′-Flanking Sequence | Target-specific Sequence | 3′-Flanking Sequence | 3′-Tag

Ex. 1: no 5′-tag or 3′-tag; sequencer adapter as the 5′-flanking sequence and the 3′-flanking sequence.

Ex. 2: sequencing adapter as the 5′-tag and the 3′-tag; sequence-specific identifier as the 5′-flanking sequence and the 3′-flanking sequence.

Ex. 3: experimental tag (short sequence or modified bases; identifier for one or several reactions to be independently discerned, e.g., by patient, sample, cell type, time course timepoint, or treatment) as the 5′-tag, 5′-flanking sequence, 3′-flanking sequence, and 3′-tag.

Ex. 4: biotin or other detection (e.g., hapten) tag/capture sequence as the 5′-tag, 5′-flanking sequence, 3′-flanking sequence, and 3′-tag.

Ex. 5: site for cleavage (enzymatic/modified base) as the 5′-tag, 5′-flanking sequence, 3′-flanking sequence, and 3′-tag; a “buffer” (e.g., spacer or universal) sequence is additionally listed in two of the columns.

In specific examples, each flanking sequence does not specifically bind to any other NPPF sequence (e.g., sequence 102 or other flanking sequence) or to any component of the sample. In some examples, if there are two flanking sequences, the sequence of each flanking sequence 104, 106 is different. Ideally, if there are two different flanking sequences (for example two different flanking sequences on the same NPPF and/or flanking sequences of other NPPFs in a set of NPPFs), each flanking sequence 104, 106 has a similar melting temperature (T_(m)), such as T_(m)s within +/−about 10° C. or +/−5° C. of one another, such as +/−4° C., 3° C., 2° C., or 1° C.

In particular examples, the flanking sequence 104, 106 is at least 12 nucleotides in length, such as at least 15, at least 20, at least 25, at least 30, at least 40, or at least 50 nucleotides in length, such as 12-50 or 12-30 nucleotides, for example, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length, wherein the contiguous nucleotides are not found in a nucleic acid molecule present in the sample to be tested. The flanking sequences are protected from degradation by the nuclease by hybridizing molecules to the flanking sequences which have a sequence complementary to the flanking sequences (CFSs).

Factors that affect NPPF-target and NPPF-CFS hybridization specificity include length of the NPPF and CFS, melting temperature, self-complementarity, and the presence of repetitive or non-unique sequence. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Press, 2001; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates, 1992 (and Supplements to 2000); Ausubel et al., Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, 4th ed., Wiley & Sons, 1999. Conditions resulting in particular degrees of hybridization (stringency) will vary depending upon the nature of the hybridization method and the composition and length of the hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic strength (such as the Na⁺ concentration) of the hybridization buffer will determine the stringency of hybridization. In some examples, the NPPFs utilized in the disclosed methods have a T_(m) of at least about 37° C., at least about 42° C., at least about 45° C., at least about 50° C., at least about 55° C., at least about 60° C., at least about 65° C., at least about 70° C., at least about 75° C., at least about 80° C., such as about 42° C.-80° C. (for example, about 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, or 80° C.). In one non-limiting example, the NPPFs utilized in the disclosed methods have a T_(m) of about 42° C. Methods of calculating the T_(m) of a probe are known to one of skill in the art (see e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Press, 2001, Chapter 10). In some examples, the NPPFs for a particular reaction are selected to each have the same or a similar T_(m) in order to facilitate simultaneous detection or sequencing of multiple target nucleic acid molecules in a sample, such as T_(m)s+/−about 10° C. of one another, such as +/−10° C., 9° C., 8° C., 7° C., 6° C., 5° C., 4° C., 3° C., 2° C., or 1° C. of one another.
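Selecting probes with similar T_(m)s can be sketched as below. The estimate used here is a common rule-of-thumb formula based only on GC content and length (64.9 + 41 × (G + C − 16.4) / N, generally applied to oligonucleotides longer than about 13 nucleotides); it ignores salt, formamide, and probe concentration, and it is an assumption of this sketch rather than the calculation used in the disclosure. The sequences and function names are hypothetical.

```python
def basic_tm(seq):
    """Rough T_m estimate in degrees C from GC content and length only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def within_window(seqs, window_c=5.0):
    """True if all estimated T_m values lie within window_c degrees of one another."""
    tms = [basic_tm(s) for s in seqs]
    return max(tms) - min(tms) <= window_c


# Hypothetical 25-nucleotide probe sequences with different GC content.
probe_1 = "ATGC" * 6 + "A"            # 12 G/C of 25
probe_2 = "GCGC" + "ATGC" * 5 + "A"   # 14 G/C of 25
probe_3 = "AT" * 9 + "GCATGCA"        # 4 G/C of 25

for probe in (probe_1, probe_2, probe_3):
    print(probe, round(basic_tm(probe), 1))

print(within_window([probe_1, probe_2]))           # True
print(within_window([probe_1, probe_2, probe_3]))  # False
```

In practice a nearest-neighbor thermodynamic model under the actual hybridization buffer conditions would give more reliable values; the window check itself is independent of which estimator is used.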

A. Flanking Sequences

One or both of the flanking sequences of the NPPF (e.g., 104 or 106 of FIG. 1) include a sequence that provides a universal amplification point. Such a sequence is complementary to at least a portion of an amplification primer. This allows the primer to hybridize to the NPPF, and amplify the NPPF. As flanking sequences can be identical between NPPFs specific for different target nucleic acid molecules, this permits the same primer to be used to amplify any number of different NPPFs. For example, an NPPF can include a 5′-flanking sequence, and a 3′-flanking sequence, wherein the 5′- and the 3′-flanking sequences are different from one another, but are the same for a plurality of NPPFs for different targets. Thus an amplification primer that includes a sequence complementary to the 5′-flanking sequence, and an amplification primer that includes a sequence complementary to the 3′-flanking sequence, can both be used in a single reaction to amplify multiple NPPFs, even if the NPPFs are specific for different target sequences.

In some examples, the flanking sequence does not include an experiment tag sequence and/or a sequencing adapter sequence. In some examples, a flanking sequence includes or consists of an experiment tag sequence and/or sequencing adapter sequence. In other examples, the primers used to amplify the NPPFs include an experiment tag sequence and/or sequencing adapter sequence, thus permitting incorporation of the experiment tag and/or sequencing adapter into the NPPF amplicon during amplification of the NPPF.

In one example, a flanking sequence is designed such that the sequence forms a loop on itself. Thus, one region of a flanking sequence is complementary to a second region of the same flanking sequence, such that the first and second regions hybridize to one another, forming a loop or hairpin. This would eliminate the need for CFSs, as the second region would protect the first region during the nuclease step.
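A self-protecting flanking sequence of this kind can be sketched as a stem-loop in which the second region is the reverse complement of the first. The sequences, the minimum loop length of 3 nucleotides, and the function names are assumptions chosen for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def forms_hairpin(seq, stem_len, min_loop=3):
    """True if the last stem_len bases can pair with the first stem_len bases.

    The two stem regions must be separated by at least min_loop unpaired bases.
    """
    if len(seq) < 2 * stem_len + min_loop:
        return False
    return seq[-stem_len:] == reverse_complement(seq[:stem_len])


# Hypothetical flanking sequence: first region, loop, then its reverse complement.
stem = "GCGCATCG"
loop = "TTTT"
hairpin_flank = stem + loop + reverse_complement(stem)

print(hairpin_flank)                                   # GCGCATCGTTTTCGATGCGC
print(forms_hairpin(hairpin_flank, stem_len=8))        # True
print(forms_hairpin("AAAAAAAATTTTCCCCCCCC", stem_len=8))  # False
```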

B. Primers that Bind the Flanking Sequences

The amplification primers that specifically bind or hybridize to the flanking sequences can be used to initiate amplification, such as PCR amplification. In addition, the amplification primers can be used to introduce nucleic acid tags (such as experiment tags or sequencing adapters) and/or detectable labels to NPPFs. For example, in addition to the amplification primer having a region complementary to the flanking sequence, it can also include a second region having a nucleic acid sequence that results in addition of an experiment tag, sequencing adapter, detectable label, or combinations thereof, to the resulting NPPF amplicon. An experiment tag or sequencing adapter can be introduced at the NPPF 5′- and/or 3′-end. In some examples, two or more experiment tags and/or sequencing adaptors are added to a single end or both ends of the NPPF amplicon, for example using a single primer having a nucleic acid sequence that results in addition of two or more experiment tags and/or sequencing adapters. Experiment tags can be used, for example, to differentiate one sample or sequence from another, or to permit capture of an NPPF amplicon by a substrate. Sequencing adapters permit capture of the resulting NPPF amplicon by a particular sequencing platform.

A detectable label can be introduced at any point of the NPPF, including the 5′- and/or 3′-end. In one example, the label is introduced to an NPPF amplicon by hybridization of a labeled probe complementary to the NPPF amplicon. In one example, the label is introduced to an NPPF amplicon by use of a labeled primer during amplification of the NPPF, thereby generating a labeled NPPF amplicon. Detectable labels permit detection of the NPPF amplicons.

In some examples, such primers are at least 12 nucleotides in length, such as at least 15, at least 20, at least 30, at least 40 or at least 50 nucleotides (for example 25 nucleotides). In some examples the primers include a detectable label (and such primers can be referred to as probes), such as biotin, that is incorporated into the NPPF amplicons.

C. Addition of Experiment Tags

Experiment tags can be part of the NPPF when generated (for example, as part of the flanking sequence). In another example, the experiment tag is added later, for example during amplification of the NPPF, resulting in an NPPF amplicon containing an experiment tag. The presence of the universal flanking sequences on the NPPF permits the use of universal primers, which can introduce other sequences onto the NPPFs, for example during amplification.

Experiment tags can be used to differentiate one sample from another, to identify the particular target sequence associated with the NPPF, or to permit capture of an NPPF amplicon by a substrate (wherein the experiment tag is complementary to a capture probe on the substrate, permitting hybridization between the two). In one example, the experiment tag is the first three, five, ten, twenty, or thirty nucleotides of the 5′- and/or 3′-end of the NPPF or NPPF amplicon.

In one example an experiment tag is used to differentiate one sample from another. For example, such a sequence can function as a barcode, to allow one to correlate a particular sequence detected with a particular sample, patient, or experiment (such as a particular reaction well, day or set of reaction conditions). This permits a particular NPPF that is detected or sequenced to be associated with a particular patient, sample, or experiment, for instance. The use of such tags provides a way to lower cost per sample and increase sample throughput, as multiple NPPF amplicons can be tagged and then combined (for example from different experiments or patients), for example in a single sequencing run or detection array. This allows different experimental or patient samples to be combined into a single run, within the same instrument channel. For example, such tags permit hundreds or thousands of different experiments to be sequenced in a single run, within a single channel. For example, pooling 100 samples per channel, 800 samples can be tested in a single run of an 8-channel sequencer. In addition, if the method includes the step of gel purifying the completed amplification reaction (or other method of purification or clean up that does not require actual separation), only one gel (or clean up or purification reaction or process) needs to be run per detection or sequencing run. The sequenced NPPF amplicons can then be sorted, for example by the experiment tags.
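Sorting pooled reads by experiment tag can be sketched as below. This is a minimal illustration that assumes the tag occupies the first `tag_len` nucleotides of each read and that tags must match exactly; the function name `demultiplex`, the three-nucleotide tags, the sample names, and the reads are all hypothetical.

```python
# Hypothetical sketch: assign pooled reads to samples by a 5' experiment tag.
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_len):
    """Group reads by sample using the first tag_len nucleotides as the tag.

    Reads whose tag is not in tag_to_sample are collected under "unassigned".
    The tag is stripped from each read before it is stored."""
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        bins[tag_to_sample.get(tag, "unassigned")].append(insert)
    return dict(bins)


# Assumed toy data: two tagged samples pooled into one run.
tags = {"AAC": "patient_1", "TTG": "patient_2"}
pooled = ["AACGGGTTT", "TTGCCCAAA", "GGGGAAAAA"]
print(demultiplex(pooled, tags, tag_len=3))
```

Exact matching keeps the sketch short; a production pipeline would typically tolerate sequencing errors in the tag, which requires tags that differ from one another at several positions.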

In one example the experiment tag is used to identify the particular target sequence associated with the NPPF. In this case, using an experiment tag to correspond to a particular target sequence can shorten the time or amount of sequencing needed, as sequencing the end of the NPPF instead of the entire NPPF can be sufficient. For example, if such an experiment tag is present on the 3′-end of the NPPF amplicon, the entire NPPF amplicon sequence itself does not have to be sequenced to identify the target sequence which hybridized to the NPPF. Instead, only the 3′-end of the NPPF amplicon containing the experiment tag needs to be sequenced. This can significantly reduce sequencing time and resources, as less material needs to be sequenced.
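The idea of identifying the target from the tag alone can be sketched as a lookup on the last few nucleotides of each read. The sketch assumes each target has its own tag at the 3′-end of the amplicon and that reads are reported in the amplicon's 5′ to 3′ orientation; the function name, the four-nucleotide tags, and the target names are hypothetical.

```python
# Hypothetical sketch: count targets by reading only the 3'-end tag of each amplicon.
from collections import Counter


def count_targets_by_tag(end_reads, tag_to_target, tag_len):
    """Tally targets using the last tag_len nucleotides of each read.

    Reads with an unrecognized tag are tallied under "unknown"."""
    counts = Counter()
    for read in end_reads:
        tag = read[-tag_len:]
        counts[tag_to_target.get(tag, "unknown")] += 1
    return counts


# Assumed toy data: short end reads that each finish with a 4-nt tag.
tag_table = {"ACGT": "target_A", "TTAG": "target_B"}
reads = ["GGACGT", "TTACGT", "CCTTAG", "AAGGGG"]
print(count_targets_by_tag(reads, tag_table, tag_len=4))
```

Only the tag positions are consulted, so the read length needed per amplicon is set by the tag length rather than by the length of the NPPF.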

In one example the experiment tag is used to permit capture of NPPFs, such as to concentrate NPPFs or NPPF amplicons from a sample. For example, the experiment tag can have a sequence that is complementary to the sequence of at least a portion of a capture probe on a substrate surface, thereby permitting hybridization of the NPPF to the capture probe. For instance, following amplification, NPPF amplicons containing an experiment tag (such as a population of NPPF amplicons containing the same experiment tag) can be isolated from other materials by incubating the sample with a substrate (such as magnetic beads) containing a plurality of capture probes with sequences complementary to the experiment tag. After their capture, the NPPF amplicons can be detected or sequenced, or can be released from the substrate for further analysis. In one example, the substrate is magnetic beads, and the PCR reaction containing NPPF amplicons is incubated with the beads. The beads are then held in a magnetic field while the sample solution (containing non-desired nucleic acid molecules and other materials) is removed. The captured NPPFs can be eluted into a smaller volume by reversing hybridization, such as by addition of base and heating. One will appreciate that similar methods can be used with other NPPFs and other substrates (such as by using a solid substrate and a flow through device), resulting in the captured NPPFs being eluted into a smaller volume. If a hapten is added during amplification, it can be used for capture. One advantage of such a method is that the NPPFs or NPPF amplicons can be isolated from a large sample, such as 1 ml plasma, and eluted into a smaller volume used for assays, such as 20 μl.

Experiment tags can also be used for amplification, such as nested amplification, or two stage amplification.

In particular examples, the experiment tag is at least 3 nucleotides in length, such as at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, or at least 50 nucleotides in length, such as 3-50, 3-20, 12-50 or 12-30 nucleotides, for example, 3, 5, 10, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length.
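The tag lengths above can be related to how many samples or targets a tag can distinguish. A tag of n nucleotides drawn from four bases has at most 4^n distinct sequences, so the shortest usable length follows from the number of items to be told apart. The sketch below computes that lower bound; the function name is hypothetical, and the bound is an upper limit on capacity rather than a design rule, since practical tag sets use fewer sequences so that tags differ at more than one position.

```python
# Hypothetical helper: smallest tag length n such that 4**n covers n_items.
def min_tag_length(n_items: int) -> int:
    """Return the shortest tag length (in nucleotides) that provides at least
    n_items distinct sequences over the four-base alphabet."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    length, capacity = 1, 4
    while capacity < n_items:
        length += 1
        capacity *= 4
    return length


for n in (64, 65, 1000):
    print(n, min_tag_length(n))
```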

D. Addition of Sequencing Adapters

Sequencing adapters can be part of the NPPF when generated (for example, as part of the flanking sequence). In another example, the sequencing adapter is added later, for example during amplification of the NPPF, resulting in an NPPF amplicon containing a sequencing adapter. The presence of the universal flanking sequences on the NPPF permits the use of universal primers, which can introduce other sequences onto the NPPFs, for example during amplification.

A sequencing adapter can be used to add a sequence to an NPPF amplicon that is needed for a particular sequencing platform. For example, some sequencing platforms (such as the 454 and Illumina platforms) require the nucleic acid molecule to be sequenced to include a particular sequence at its 5′- and/or 3′-end, for example to capture the molecule to be sequenced. For example, the appropriate sequencing adapter is recognized by a complementary sequence on the sequencing chip or beads, and the NPPF is captured by the presence of the sequencing adapter.

In one example, a poly-A (or poly-T), such as a poly-A or poly-T at least 10 nucleotides in length, is added to the NPPF during PCR amplification. In a specific example, the poly-A (or poly-T) is added to the 3′-end of the NPPF. In some examples, this added sequence is poly-adenylated at its 3′ end using a terminal deoxynucleotidyl transferase (TdT).

In particular examples, the sequencing adapter added is at least 12 nucleotides (nt) in length, such as at least 15, at least 20, at least 25, at least 30, at least 40, or at least 50 nt in length, such as 12-50 or 12-30 nt, for example, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nt in length.

E. Detectable Labels

In some examples, the disclosed NPPFs, PCR primers, or both, include one or more detectable labels. Detectable labels are well known in the art. A detectable label is a molecule or material that can be used to produce a detectable signal that indicates the presence or concentration of an NPPF or NPPF amplicon (e.g., the bound or hybridized probe) in a sample. Thus, a labeled NPPF provides an indicator of the presence or concentration of a target nucleic acid sequence (e.g., a target DNA or a target RNA) in a sample. The disclosure is not limited to the use of particular labels, although examples are provided.

In some examples, the label is incorporated into the NPPF during synthesis of the NPPF. In some examples, the label is incorporated into the NPPF during amplification, for example using labeled primers (thus generating labeled NPPF amplicons). In yet other examples, the NPPF is labeled by using a labeled probe that is complementary to, and thus hybridizes to, a portion of the NPPF (such as an NPPF amplicon), such as a flanking region of the NPPF.

In some examples, each of the NPPFs included in a plurality of NPPFs utilized in the disclosed methods is labeled with the same detectable label. In other examples at least one NPPF is labeled with a different detectable label than at least one other NPPF in the plurality of NPPFs. For example, at least one NPPF included in the plurality of NPPFs can be labeled with a fluorophore (such as Cy-3™) and at least one NPPF included in the plurality of NPPFs can be labeled with a different fluorophore (such as Cy-5™). In some examples, the plurality of NPPFs can include at least 2, 3, 4, 5, 6, or more different detectable labels. Similarly, amplification primers used in the methods provided herein can be labeled with the same or different detectable labels.

A label associated with one or more nucleic acid molecules (such as an NPPF or amplification primer) can be detected either directly or indirectly. A label can be detected by any known or yet to be discovered mechanism including absorption, emission and/or scattering of a photon (including radio frequency, microwave frequency, infrared frequency, visible frequency and ultra-violet frequency photons). Detectable labels include colored, fluorescent, electroluminescent, phosphorescent and luminescent molecules and materials, catalysts (such as enzymes) that convert one substance into another substance to provide a detectable difference (such as by converting a colorless substance into a colored substance or vice versa, or by producing a precipitate or increasing sample turbidity), haptens, and paramagnetic and magnetic molecules or materials. Additional detectable labels include Raman (light scattering) labels (e.g., Nanoplex® biotags, Oxonica, Bucks, UK). Other exemplary detectable labels include digoxin, the use of energy transfer and energy quenching pairs (such as FRET), IR, and absorbance/colorimetric labels.

In non-limiting examples, NPPFs or primers are labeled with dNTPs covalently attached to hapten molecules (such as a nitro-aromatic compound (e.g., dinitrophenyl (DNP)), biotin, fluorescein, digoxigenin, etc.). Methods for conjugating haptens and other labels to dNTPs (e.g., to facilitate incorporation into labeled probes) are well known in the art. For examples of procedures, see, e.g., U.S. Pat. Nos. 5,258,507, 4,772,691, 5,328,824, and 4,711,955. A label can be directly or indirectly attached to a dNTP at any location on the dNTP, such as a phosphate (e.g., α, β or γ phosphate) or a sugar. In some examples, detection of labeled nucleic acid molecules can be accomplished by contacting the hapten-labeled NPPF with a primary anti-hapten antibody. In one example, the primary anti-hapten antibody (such as a mouse anti-hapten antibody) is directly labeled with an enzyme. In another example, a secondary anti-antibody (such as a goat anti-mouse IgG antibody) conjugated to an enzyme is used for signal amplification. In other examples, the hapten is biotin and is detected by contacting the hapten-labeled NPPF with avidin or streptavidin conjugated to an enzyme, such as horseradish peroxidase (HRP) or alkaline phosphatase (AP).

Additional examples of detectable labels include fluorescent molecules (or fluorochromes). Numerous fluorochromes are known to those of skill in the art, and can be selected, for example, from Life Technologies (formerly Invitrogen; see, e.g., The Handbook: A Guide to Fluorescent Probes and Labeling Technologies). Examples of particular fluorophores that can be attached (for example, chemically conjugated) to a nucleic acid molecule (such as an NPPF) are provided in U.S. Pat. No. 5,866,366 to Nazarenko et al., such as 4-acetamido-4′-isothiocyanatostilbene-2,2′-disulfonic acid, acridine and derivatives such as acridine and acridine isothiocyanate, 5-(2′-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS), 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5-disulfonate (Lucifer Yellow VS), N-(4-anilino-1-naphthyl)maleimide, anthranilamide, Brilliant Yellow, coumarin and derivatives such as coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanosine; 4′,6-diamidino-2-phenylindole (DAPI); 5′,5″-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red); 7-diethylamino-3-(4′-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4′-diisothiocyanatodihydro-stilbene-2,2′-disulfonic acid; 4,4′-diisothiocyanatostilbene-2,2′-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-(4′-dimethylaminophenylazo)benzoic acid (DABCYL); 4-dimethylaminophenylazophenyl-4′-isothiocyanate (DABITC); eosin and derivatives such as eosin and eosin isothiocyanate; erythrosin and derivatives such as erythrosin B and erythrosin isothiocyanate; ethidium; fluorescein and derivatives such as 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2′,7′-dimethoxy-4′,5′-dichloro-6-carboxyfluorescein (JOE), fluorescein, fluorescein isothiocyanate (FITC), and QFITC (XRITC); 2′,7′-difluorofluorescein (OREGON GREEN®); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate; 4-methylumbelliferone; orthocresolphthalein; nitrotyrosine; pararosaniline; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives such as pyrene, pyrene butyrate and succinimidyl 1-pyrene butyrate; Reactive Red 4 (Cibacron Brilliant Red 3B-A); rhodamine and derivatives such as 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, rhodamine green, sulforhodamine B, sulforhodamine 101 and sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid and terbium chelate derivatives.

Other suitable fluorophores include thiol-reactive europium chelates which emit at approximately 617 nm (Heyduk and Heyduk, Analyt. Biochem. 248:216-27, 1997; J. Biol. Chem. 274:3315-22, 1999), as well as GFP, Lissamine™, diethylaminocoumarin, fluorescein chlorotriazinyl, naphthofluorescein, 4,7-dichlororhodamine and xanthene (as described in U.S. Pat. No. 5,800,996 to Lee et al.) and derivatives thereof. Other fluorophores known to those skilled in the art can also be used, for example those available from Life Technologies (Invitrogen; Molecular Probes (Eugene, Oreg.)) and including the ALEXA FLUOR® series of dyes (for example, as described in U.S. Pat. Nos. 5,696,157, 6,130,101 and 6,716,979), the BODIPY series of dyes (dipyrrometheneboron difluoride dyes, for example as described in U.S. Pat. Nos. 4,774,339, 5,187,288, 5,248,782, 5,274,113, 5,338,854, 5,451,663 and 5,433,896), Cascade Blue (an amine reactive derivative of the sulfonated pyrene described in U.S. Pat. No. 5,132,432) and Marina Blue (U.S. Pat. No. 5,830,912).

In addition to the fluorochromes described above, a fluorescent label can be a fluorescent nanoparticle, such as a semiconductor nanocrystal, e.g., a QUANTUM DOT™ (obtained, for example, from Life Technologies (QuantumDot Corp, Invitrogen Nanocrystal Technologies, Eugene, Oreg.); see also, U.S. Pat. Nos. 6,815,064; 6,682,596; and 6,649,138). Semiconductor nanocrystals are microscopic particles having size-dependent optical and/or electrical properties. When semiconductor nanocrystals are illuminated with a primary energy source, a secondary emission of energy occurs of a frequency that corresponds to the bandgap of the semiconductor material used in the semiconductor nanocrystal. This emission can be detected as colored light of a specific wavelength or fluorescence. Semiconductor nanocrystals with different spectral characteristics are described in, e.g., U.S. Pat. No. 6,602,671. Semiconductor nanocrystals can be coupled to a variety of biological molecules (including dNTPs and/or nucleic acids) or substrates by techniques described in, for example, Bruchez et al., Science 281:2013-2016, 1998; Chan et al., Science 281:2016-2018, 1998; and U.S. Pat. No. 6,274,323.

Formation of semiconductor nanocrystals of various compositions is disclosed in, e.g., U.S. Pat. Nos. 6,927,069; 6,914,256; 6,855,202; 6,709,929; 6,689,338; 6,500,622; 6,306,736; 6,225,198; 6,207,392; 6,114,038; 6,048,616; 5,990,479; 5,690,807; 5,571,018; 5,505,928; 5,262,357 and in U.S. Patent Publication No. 2003/0165951 as well as PCT Publication No. 99/26299 (published May 27, 1999). Separate populations of semiconductor nanocrystals can be produced that are identifiable based on their different spectral characteristics. For example, semiconductor nanocrystals can be produced that emit light of different colors based on their composition, size or size and composition. For example, quantum dots that emit light at different wavelengths based on size (565 nm, 655 nm, 705 nm, or 800 nm emission wavelengths), which are suitable as fluorescent labels in the probes disclosed herein, are available from Life Technologies (Carlsbad, Calif.).

Additional labels include, for example, radioisotopes (such as ³H), metal chelates such as DOTA and DTPA chelates of radioactive or paramagnetic metal ions like Gd³⁺, and liposomes.

Detectable labels that can be used with nucleic acid molecules (such as an NPPF or amplification primer) also include enzymes, for example HRP, AP, acid phosphatase, glucose oxidase, β-galactosidase, β-glucuronidase, or β-lactamase. Where the detectable label includes an enzyme, a chromogen, fluorogenic compound, or luminogenic compound can be used in combination with the enzyme to generate a detectable signal (numerous such compounds are commercially available, for example, from Life Technologies, Carlsbad, Calif.). Particular examples of chromogenic compounds include diaminobenzidine (DAB), 4-nitrophenylphosphate (pNPP), fast red, fast blue, bromochloroindolyl phosphate (BCIP), nitro blue tetrazolium (NBT), BCIP/NBT, AP Orange, AP blue, tetramethylbenzidine (TMB), 2,2′-azino-di-[3-ethylbenzothiazoline sulphonate] (ABTS), o-dianisidine, 4-chloronaphthol (4-CN), o-nitrophenyl-β-D-galactopyranoside (ONPG), o-phenylenediamine (OPD), 5-bromo-4-chloro-3-indolyl-β-galactopyranoside (X-Gal), methylumbelliferyl-β-D-galactopyranoside (MU-Gal), p-nitrophenyl-α-D-galactopyranoside (PNP), 5-bromo-4-chloro-3-indolyl-β-D-glucuronide (X-Gluc), 3-amino-9-ethylcarbazole (AEC), fuchsin, iodonitrotetrazolium (INT), tetrazolium blue and tetrazolium violet.

Alternatively, an enzyme can be used in a metallographic detection scheme. Metallographic detection methods include using an enzyme, such as alkaline phosphatase, in combination with a water-soluble metal ion and a redox-inactive substrate of the enzyme. The substrate is converted to a redox-active agent by the enzyme, and the redox-active agent reduces the metal ion, causing it to form a detectable precipitate. (See, for example, U.S. Patent Application Publication No. 2005/0100976, PCT Publication No. 2005/003777 and U.S. Patent Application Publication No. 2004/0265922). Metallographic detection methods also include using an oxido-reductase enzyme (such as horseradish peroxidase) along with a water-soluble metal ion, an oxidizing agent and a reducing agent, again to form a detectable precipitate. (See, for example, U.S. Pat. No. 6,670,113).

In some embodiments, the detectable label is attached to or incorporated in the NPPF or primer at the 5′ end or the 3′ end (e.g., the NPPF or primer is an end-labeled probe). In other examples the detectable label is incorporated in the NPPF or primer at an internal position, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more bases from the 5′ end of the NPPF or primer, or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more bases from the 3′ end of the NPPF or primer.

In one example, one of the flanking regions of the NPPF contains an acceptor or emitter (such as an acceptor fluorophore), while the amplification primer complementary to the flanking region contains the converse (such as a donor fluorophore). Thus the primer-NPPF duplex emits detectable signal, but single stranded primers, or single stranded NPPFs, do not. The appearance of signal is a measure of the amount of NPPF in the sample, and can be measured without separation of the labeled excess primers from the amplified adducts. Examples of FRET acceptor-donor pairs are known in the art and can include FAM as a donor fluorophore for use with JOE, TAMRA, and ROX. 3-(ε-carboxy-pentyl)-3′-ethyl-5,5′-dimethyloxacarbocyanine (CYA) can serve as a donor fluorophore for rhodamine derivatives (such as R6G, TAMRA, and ROX), which can be used as acceptor fluorophores. Grant et al. (Biosens Bioelectron. 16:231-7, 2001) provide particular examples of FRET pairs that can be used in the methods disclosed herein.

V. Samples

A sample is any collective comprising one or more targets, such as a biological sample or biological specimen. The sample can be collected or obtained using methods well known to those ordinarily skilled in the art. The samples of use in the disclosed methods can include any specimen that includes nucleic acid (such as genomic DNA, cDNA, viral DNA or RNA, rRNA, tRNA, mRNA, miRNA, oligonucleotides, nucleic acid fragments, modified nucleic acids, synthetic nucleic acids, or the like). In one example, the sample includes unstable RNA. In some examples, the nucleic acid molecule to be detected or sequenced is cross-linked in the sample (such as a cross-linked DNA, mRNA, miRNA, or vRNA) or is soluble in the sample. In some examples, the sample is a fixed sample, such as a sample that includes an agent that causes target molecule cross-linking. In some examples, the target nucleic acid in the sample is not extracted, solubilized, or both, prior to detecting or sequencing the target nucleic acid molecule.

In some examples, the disclosed methods include obtaining the sample prior to analysis of the sample. In some examples, the disclosed methods include selecting a subject having a tumor, and then in some examples further selecting one or more target DNAs or RNAs to detect based on the subject's tumor, for example, to determine a diagnosis or prognosis for the subject or for selection of one or more therapies. In some examples, nucleic acid molecules in a sample to be analyzed are first isolated, extracted, concentrated, or combinations thereof, from the sample.

In some examples, RNA in the sample is reverse transcribed prior to performing the methods provided herein. However, the disclosed methods do not require reverse transcription, as the target RNA sequence is effectively converted into a complementary probe sequence through hybridization and nuclease activity. It is sometimes desirable to sequence RNA molecules rather than the gene sequences which encode the RNA, since RNA molecules are not necessarily co-linear with their DNA template. In addition, the genomes of some organisms, such as RNA viruses, are RNA.

In some examples, the sample is lysed. The lysis buffer is designed to inactivate enzymes and prevent the degradation of RNA, but after a limited dilution into a hybridization dilution buffer it permits nuclease activity and facilitates hybridization with stringent specificity. A dilution buffer can be added to neutralize the inhibitory activity of the lysis and other buffers, such as inhibitory activity for other enzymes (e.g., polymerase). Alternatively, the composition of the lysis buffer and other buffers can be changed to a composition that is tolerated, for example by a polymerase.

In some examples, the methods include analyzing a plurality of samples simultaneously or contemporaneously. For example, the methods can analyze at least two different samples (for example from different patients) simultaneously or contemporaneously. In one example, the methods can detect or sequence at least two different target nucleic acid molecules (such as 2, 3, 4, 5, 6, 7, 8, 9 or 10 different targets) in at least two different samples (such as at least 5, at least 10, at least 100, at least 500, at least 1000, or at least 10,000 different samples) simultaneously or contemporaneously.

Exemplary samples include, without limitation, cells, cell lysates, blood smears, cytocentrifuge preparations, cytology smears, bodily fluids (e.g., blood and fractions thereof such as serum and plasma, saliva, sputum, urine, spinal fluid, gastric fluid, sweat, semen, etc.), cytological smears, buccal cells, extracts of tissues, cells or organs, tissue biopsies (e.g., tumor biopsies), fine-needle aspirates, punch biopsies, circulating tumor cells, fresh tissue, frozen tissue, fixed tissue, fixed and wax- (e.g., paraffin-)embedded tissue, bone marrow, and/or tissue sections (e.g., cryostat tissue sections and/or paraffin-embedded tissue sections). The biological sample may also be a laboratory research sample such as a cell culture sample or supernatant.

Methods of obtaining a sample from a subject are known in the art. For example, methods of obtaining tissue or cell samples are routine. Exemplary samples may be obtained from normal cells or tissues, or from neoplastic cells or tissues. Neoplasia is a biological condition in which one or more cells have undergone characteristic anaplasia with loss of differentiation, increased rate of growth, invasion of surrounding tissue, and which cells may be capable of metastasis. In particular examples, a biological sample includes a tumor sample, such as a sample containing neoplastic cells.

Exemplary neoplastic cells or tissues may be included in or isolated from solid tumors, including lung cancer (e.g., non-small cell lung cancer, such as lung squamous cell carcinoma), breast carcinomas (e.g., lobular and duct carcinomas), adrenocortical cancer, ameloblastoma, ampullary cancer, bladder cancer, bone cancer, cervical cancer, cholangioma, colorectal cancer, endometrial cancer, esophageal cancer, gastric cancer, glioma, granular cell tumor, head and neck cancer, hepatocellular cancer, hydatidiform mole, lymphoma, melanoma, mesothelioma, myeloma, neuroblastoma, oral cancer, osteochondroma, osteosarcoma, ovarian cancer, pancreatic cancer, pilomatricoma, prostate cancer, renal cell cancer, salivary gland tumor, soft tissue tumors, Spitz nevus, squamous cell cancer, teratoid cancer, and thyroid cancer. Exemplary neoplastic cells may also be included in or isolated from hematological cancers including leukemias, including acute leukemias (such as acute lymphocytic leukemia, acute myelocytic leukemia, acute myelogenous leukemia and myeloblastic, promyelocytic, myelomonocytic, monocytic and erythroleukemia), chronic leukemias (such as chronic myelocytic (granulocytic) leukemia, chronic myelogenous leukemia, and chronic lymphocytic leukemia), polycythemia vera, lymphoma, Hodgkin's disease, non-Hodgkin's lymphoma (indolent and high grade forms), multiple myeloma, Waldenstrom's macroglobulinemia, heavy chain disease, myelodysplastic syndrome, and myelodysplasia.

For example, a sample from a tumor that contains cellular material can be obtained by surgical excision of all or part of the tumor, by collecting a fine needle aspirate from the tumor, as well as other methods known in the art. In some examples, a tissue or cell sample is applied to a substrate and analyzed to determine presence of one or more target DNAs or RNAs. A solid support useful in a disclosed method need only bear the biological sample and, optionally, permit the convenient detection of components (e.g., proteins and/or nucleic acid sequences) in the sample. Exemplary supports include microscope slides (e.g., glass microscope slides or plastic microscope slides), coverslips (e.g., glass coverslips or plastic coverslips), tissue culture dishes, multi-well plates, membranes (e.g., nitrocellulose or polyvinylidene fluoride (PVDF)) or BIACORE™ chips.

The disclosed methods are sensitive and specific and allow detection of target nucleic acid molecules in a sample containing even a limited number of cells. Samples that include small numbers of cells, such as less than 250,000 cells (for example less than 100,000, less than 50,000, less than 10,000, less than 1,000, less than 500, less than 200, less than 100 cells, or less than 10 cells), include but are not limited to, FFPE samples, fine needle aspirates (such as those from lung, prostate, lymph, breast, or liver), punch biopsies, needle biopsies, small populations of (e.g., FACS) sorted cells or circulating tumor cells, lung aspirates, small numbers of laser captured or macrodissected cells or circulating tumor cells, exosomes and other subcellular particles, or body fluids (such as plasma, serum, spinal fluid, saliva, and breast aspirates). For example, a target DNA or target RNA can be detected in as few as 1000 cells (such as a sample including 1000 or more cells, such as 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 15,000, 20,000, 50,000, or more cells). In some examples, expression of a target DNA or target RNA can be detected in about 1000 to 100,000 cells, for example about 1000 to 50,000, 1000 to 15,000, 1000 to 10,000, 1000 to 5000, 3000 to 50,000, 6000 to 30,000, or 10,000 to 50,000 cells. In some examples, expression of a target DNA or target RNA can be detected in about 100 to 250,000 cells, for example about 100 to 100,000, 100 to 50,000, 100 to 10,000, 100 to 5000, 100 to 500, 100 to 200, or 100 to 150 cells. In other examples, expression of a target DNA or target RNA can be detected in about 1 to 1000 cells (such as about 1 to 500 cells, about 1 to 250 cells, about 1 to 100 cells, about 1 to 50 cells, about 1 to 25 cells, or about 1 cell).

Samples may be treated in a number of ways known to those of ordinary skill in the art prior to (or contemporaneous with) contacting the sample with a target-specific reagent (such as a NPPF). One relatively simple treatment is suspension of the sample in a buffer, e.g., lysis buffer, which conserves all components of the sample in a single solution. Many traditional methods for detecting targets require more complex sample processing (e.g., involving multiple steps and/or various types of specialized instruments) to make the target accessible to a target-specific reagent(s). For example, certain detection methods require partial or complete isolation (e.g., extraction) of a target (e.g., DNA or mRNA) from the sample. A target (such as DNA or RNA) has been isolated or extracted when it is purified away from other non-target biological components in a sample. Purification refers to separating the target from one or more extraneous components also found in a sample. For example, prior to PCR-based detection of mRNA with paired target-specific primers, total or soluble mRNA (including the target mRNA) often is separated from cell proteins and other nucleic acids in the sample. Components that are isolated, extracted or purified from a mixed specimen or sample typically are enriched by at least 50%, at least 60%, at least 75%, at least 90%, or at least 98% or even at least 99% compared to the unpurified or non-extracted sample.

Isolation of biological components from a sample is time consuming and bears the risk of loss of the component that is being isolated, e.g., by degradation and/or poor efficiency or incompleteness of the process(es) used for isolation. Moreover, with some samples, such as fixed tissues, targets (such as DNA or RNA (e.g., mRNA or miRNA)) are notoriously difficult to isolate with high fidelity (e.g., as compared to fresh or frozen tissues) because it is thought that at least some proportion of the targets are cross-linked to other components in the fixed sample and, therefore, cannot be readily isolated or solubilized and may be lost upon separation of soluble and insoluble fractions. Accordingly, in some examples, methods of detecting a target nucleic acid do not require or involve purification, extraction or isolation of a target from a sample prior to contacting the sample with one or more NPPFs, and/or involve only suspending the sample in a solution, e.g., lysis buffer, that retains all components of the sample prior to contacting the sample with a target-specific reagent.

In some examples, cells in the sample are lysed or permeabilized in an aqueous solution (for example using a lysis buffer). The aqueous solution or lysis buffer includes detergent (such as sodium dodecyl sulfate) and one or more chaotropic agents (such as formamide, guanidinium HCl, guanidinium isothiocyanate, or urea). The solution may also contain a buffer (for example SSC). In some examples, the lysis buffer includes about 15% to 25% formamide (v/v), about 0.01% to 0.1% SDS, and about 0.5-6×SSC (for example, about 3×SSC). The buffer may optionally include tRNA (for example, about 0.001 to about 2.0 mg/ml) or a ribonuclease; DNAase; proteinase K; enzymes (e.g., collagenase or lipase) that degrade protein, matrix, carbohydrate, lipids, or one species of oligonucleotides; or combinations thereof. The lysis buffer may also include a pH indicator, such as Phenol Red. In a particular example, the lysis buffer includes 20% formamide, 3×SSC (79.5%), 0.05% SDS, 1 μg/ml tRNA, and 1 mg/ml Phenol Red. Cells are incubated in the aqueous solution (optionally overlaid with oil to prevent evaporation or to serve as a sink for paraffin) for a sufficient period of time (such as about 1 minute to about 60 minutes, for example about 5 minutes to about 20 minutes, or about 10 minutes) and at a sufficient temperature (such as about 22° C. to about 110° C., for example, about 80° C. to about 105° C., about 37° C. to about 105° C., or about 90° C. to about 100° C.) to lyse or permeabilize the cell. In some examples, lysis is performed at about 95° C. In some examples, the lysis step includes incubating the sample at about 95° C. for about 5-15 minutes to denature RNA in the sample, but not genomic DNA. In other examples, the lysis step includes incubating the sample at about 105° C. for about 5-15 minutes to denature both RNA and genomic DNA in the sample. In one example, Proteinase K is included with the lysis buffer.

In some examples, the crude cell lysate is used directly without further purification. The cells may be lysed in the presence or absence of one or more of the disclosed NPPFs. If the cells are lysed in the absence of probe, the one or more probes can be subsequently added to the crude lysate. In other examples, nucleic acids (such as DNA and/or RNA) are isolated from the cell lysate prior to contacting with one or more NPPFs.

In other examples, tissue samples are prepared by fixing and embedding the tissue in a medium, or a cell suspension is prepared as a monolayer on a solid support (such as a glass slide), for example by smearing or centrifuging cells onto the solid support. In further examples, fresh frozen (for example, unfixed) tissue or tissue sections may be used in the methods disclosed herein. In particular examples, FFPE tissue sections are used in the disclosed methods.

In some examples an embedding medium is used. An embedding medium is an inert material in which tissues and/or cells are embedded to help preserve them for future analysis. Embedding also enables tissue samples to be sliced into thin sections. Embedding media include paraffin, celloidin, OCT™ compound, agar, plastics, or acrylics. Many embedding media are hydrophobic; therefore, the inert material may need to be removed prior to analysis, which utilizes primarily hydrophilic reagents. The term deparaffinization or dewaxing is broadly used herein to refer to the partial or complete removal of any type of embedding medium from a biological sample. For example, paraffin-embedded tissue sections are dewaxed by passage through organic solvents, such as toluene, xylene, limonene, or other suitable solvents. In other examples, paraffin-embedded tissue sections are utilized directly (e.g., without a dewaxing step).

Tissues can be fixed by any suitable process, including perfusion or by submersion in a fixative. Fixatives can be classified as cross-linking agents (such as aldehydes, e.g., formaldehyde, paraformaldehyde, and glutaraldehyde, as well as non-aldehyde cross-linking agents), oxidizing agents (e.g., metallic ions and complexes, such as osmium tetroxide and chromic acid), protein-denaturing agents (e.g., acetic acid, methanol, and ethanol), fixatives of unknown mechanism (e.g., mercuric chloride, acetone, and picric acid), combination reagents (e.g., Carnoy's fixative, methacarn, Bouin's fluid, B5 fixative, Rossman's fluid, and Gendre's fluid), microwaves, and miscellaneous fixatives (e.g., excluded volume fixation and vapor fixation). Additives may also be included in the fixative, such as buffers, detergents, tannic acid, phenol, metal salts (such as zinc chloride, zinc sulfate, and lithium salts), and lanthanum. The most commonly used fixative in preparing tissue or cell samples is formaldehyde, generally in the form of a formalin solution (4% formaldehyde in a buffer solution, referred to as 10% buffered formalin). In one example, the fixative is 10% neutral buffered formalin, and thus in some examples the sample is formalin fixed.

In some examples, the sample is an environmental sample (such as a soil, air, or water sample, or a sample obtained from a surface (for example by swabbing)), or a food sample (such as a vegetable, fruit, dairy or meat containing sample), for example to detect pathogens that may be present.

VI. Target Nucleic Acids

A target nucleic acid molecule is a nucleic acid molecule that is capable of detection, or of interest, or useful to detect with the disclosed methods. Targets include single-, double- or other multiple-stranded nucleic acid molecules (such as DNA (e.g., genomic, mitochondrial, or synthetic) or RNA (such as mRNA, miRNA, tRNA, siRNA, long non-coding (nc) RNA, biologically occurring anti-sense RNA, Piwi-interacting RNAs (piRNAs), or small nucleolar RNAs (snoRNAs))), whether from eukaryotes, prokaryotes, viruses, fungi, bacteria or other biological organism. Genomic DNA targets may include one or several parts of the genome, such as coding regions (e.g., genes or exons) or non-coding regions (whether having known or unknown biological function, e.g., enhancers, promoters, regulatory regions, telomeres, or “nonsense” DNA). In some embodiments, a target may contain or be the result of a mutation (e.g., germ line or somatic mutation) that may be naturally occurring or otherwise induced (e.g., chemically or radiation-induced mutation). Such mutations may include (or result from) genomic rearrangements (such as translocations, insertions, deletions, or inversions), single nucleotide variations, and/or genomic amplifications. In some embodiments, a target may contain one or more modified or synthetic monomer units (e.g., peptide nucleic acid (PNA), locked nucleic acid (LNA), methylated nucleic acid, post-translationally modified amino acid, cross-linked nucleic acid or cross-linked amino acid).

The portion of a target nucleic acid molecule to which a NPPF may specifically bind also may be referred to as “target,” again, as context dictates, but more specifically may be referred to as target portion, complementary region (CR), target site, protected target region or protected site, or similar. A NPPF specifically bound to its complementary region forms a complex, which complex may remain integrated with the target as a whole and/or sample, or be separate (or be or become separated) from the target as a whole and/or the sample. In some embodiments, a NPPF/CR complex is separated (or becomes disassociated) from the target as a whole and/or the sample, e.g., by the action of a nuclease, such as S1 nuclease.

All types of target nucleic acid molecules can be analyzed using the disclosed methods. In one example, the target is a ribonucleic acid (RNA) molecule, such as a messenger RNA (mRNA), a ribosomal RNA (rRNA), a transfer RNA (tRNA), micro RNA (miRNA), an siRNA, anti-sense RNA, or a viral RNA (vRNA). In another example, the target is a deoxyribonucleic acid (DNA) molecule, such as genomic DNA (gDNA), mitochondrial DNA (mtDNA), chloroplast DNA (cpDNA), viral DNA (vDNA), cDNA, or a transfected DNA. In a specific example, the target is an antisense nucleotide. In some examples, the whole transcriptome of a cell or a tissue can be analyzed using the disclosed methods. In one example, the target nucleic acid molecule is a rare nucleic acid molecule, for example one appearing less than about 100,000 times, less than about 10,000 times, less than about 5,000 times, less than about 100 times, less than 10 times, or only once in the sample (such as a nucleic acid molecule appearing only 1 to 10,000, 1 to 5,000, 1 to 100 or 1 to 10 times in the sample).

A plurality of targets can be detected or sequenced in the same sample or assay, or even in multiple samples or assays, for example simultaneously or contemporaneously. Similarly, a single target can be detected or sequenced in a plurality of samples, for example simultaneously or contemporaneously. In one example the target nucleic acid molecules are an miRNA and an mRNA. Thus, in such an example, the method would include the use of at least one NPPF specific for the miRNA and at least one NPPF specific for the mRNA. In one example the target nucleic acid molecules are two different DNA molecules. Thus, in such an example, the method would include the use of at least one NPPF specific for the first target DNA and at least one NPPF specific for the second target DNA. In one example the target nucleic acid molecules are two different RNA molecules. Thus, in such an example, the method would include the use of at least one NPPF specific for the first target RNA and at least one NPPF specific for the second target RNA.

In some examples, the disclosed methods permit detection or sequencing of DNA or RNA single nucleotide polymorphisms (SNPs) or variants (sNPVs), splice junctions, methylated DNA, gene fusions or other mutations, protein-bound DNA or RNA, and also cDNA, as well as levels of expression (such as DNA or RNA expression, such as cDNA expression, mRNA expression, miRNA expression, rRNA expression, siRNA expression, or tRNA expression). Any nucleic acid molecule to which a nuclease protection probe can be designed to hybridize can be quantified and identified by the disclosed methods, even though the target nucleic acid molecules themselves need not be sequenced and are even in some examples destroyed.

In one example, DNA methylation is detected by using an NPPF that includes a base mismatch at the site where methylation has or has not occurred, such that upon treatment of the target sample, methylated bases are converted to a different base, complementary to the base in the NPPF.

One skilled in the art will appreciate that the target can include natural or unnatural bases, or combinations thereof.

In specific non-limiting examples, a target nucleic acid (such as a target DNA or target RNA) associated with a neoplasm (for example, a cancer) is selected. Numerous chromosome abnormalities (including translocations and other rearrangements, reduplication or deletion) or mutations have been identified in neoplastic cells, especially in cancer cells, such as B cell and T cell leukemias, lymphomas, breast cancer, colon cancer, neurological cancers and the like.

In some examples, a target nucleic acid molecule includes GAPDH (e.g., GenBank Accession No. NM_002046), PPIA (e.g., GenBank Accession No. NM_021130), RPLP0 (e.g., GenBank Accession Nos. NM_001002 or NM_053275), RPL19 (e.g., GenBank Accession No. NM_000981), ZEB1 (e.g., GenBank Accession No. NM_030751), ZEB2 (e.g., GenBank Accession Nos. NM_001171653 or NM_014795), CDH1 (e.g., GenBank Accession No. NM_004360), CDH2 (e.g., GenBank Accession No. NM_007664), VIM (e.g., GenBank Accession No. NM_003380), ACTA2 (e.g., GenBank Accession No. NM_001141945 or NM_001613), CTNNB1 (e.g., GenBank Accession No. NM_001904, NM_001098209, or NM_001098210), KRT8 (e.g., GenBank Accession No. NM_002273), SNAI1 (e.g., GenBank Accession No. NM_005985), SNAI2 (e.g., GenBank Accession No. NM_003068), TWIST1 (e.g., GenBank Accession No. NM_000474), CD44 (e.g., GenBank Accession No. NM_000610, NM_001001389, NM_00100390, NM_001202555, NM_001001391, NM_001202556, NM_001001392, NM_001202557), CD24 (e.g., GenBank Accession No. NM_013230), FN1 (e.g., GenBank Accession No. NM_212474, NM_212476, NM_212478, NM_002026, NM_212482, NM_054034), IL6 (e.g., GenBank Accession No. NM_000600), MYC (e.g., GenBank Accession No. NM_002467), VEGFA (e.g., GenBank Accession No. NM_001025366, NM_001171623, NM_003376, NM_001171624, NM_001204384, NM_001204385, NM_001025367, NM_001171625, NM_001025368, NM_001171626, NM_001033756, NM_001171627, NM_001025370, NM_001171628, NM_001171622, NM_001171630), HIF1A (e.g., GenBank Accession No. NM_001530, NM_181054), EPAS1 (e.g., GenBank Accession No. NM_001430), ESR2 (e.g., GenBank Accession No. NM_001040276, NM_001040275, NM_001214902, NM_001437, NM_001214903), PRKCE (e.g., GenBank Accession No. NM_005400), EZH2 (e.g., GenBank Accession No. NM_001203248, NM_152998, NM_001203247, NM_004456, NM_001203249), DAB2IP (e.g., GenBank Accession No. NM_032552, NM_138709), B2M (e.g., GenBank Accession No. NM_004048), and SDHA (e.g., GenBank Accession No. NM_004168).

In some examples, a target miRNA includes hsa-miR-205 (MIR205, e.g., GenBank Accession No. NR_029622), hsa-miR-324 (MIR324, e.g., GenBank Accession No. NR_029896), hsa-miR-301a (MIR301A, e.g., GenBank Accession No. NR_029842), hsa-miR-106b (MIR106B, e.g., GenBank Accession No. NR_029831), hsa-miR-877 (MIR877, e.g., GenBank Accession No. NR_030615), hsa-miR-339 (MIR339, e.g., GenBank Accession No. NR_029898), hsa-miR-10b (MIR10B, e.g., GenBank Accession No. NR_029609), hsa-miR-185 (MIR185, e.g., GenBank Accession No. NR_029706), hsa-miR-27b (MIR27B, e.g., GenBank Accession No. NR_029665), hsa-miR-492 (MIR492, e.g., GenBank Accession No. NR_030171), hsa-miR-146a (MIR146A, e.g., GenBank Accession No. NR_029701), hsa-miR-200a (MIR200A, e.g., GenBank Accession No. NR_029834), hsa-miR-30c (e.g., GenBank Accession No. NR_029833, NR_029598), hsa-miR-29c (MIR29C, e.g., GenBank Accession No. NR_029832), hsa-miR-191 (MIR191, e.g., GenBank Accession No. NR_029690), or hsa-miR-655 (MIR655, e.g., GenBank Accession No. NR_030391).

In one example the target is a pathogen nucleic acid, such as viral RNA or DNA. Exemplary pathogens include, but are not limited to, viruses, bacteria, fungi, and protozoa. In one example, the target is a viral RNA. Viruses include positive-strand RNA viruses and negative-strand RNA viruses. Exemplary positive-strand RNA viruses include, but are not limited to: Picornaviruses (such as Aphthoviridae [for example, foot-and-mouth-disease virus (FMDV)]; Cardioviridae; Enteroviridae (such as Coxsackie viruses, Echoviruses, Enteroviruses, and Polioviruses); Rhinoviridae (Rhinoviruses); and Hepataviridae (Hepatitis A viruses)); Togaviruses (examples of which include rubella and alphaviruses (such as Western equine encephalitis virus, Eastern equine encephalitis virus, and Venezuelan equine encephalitis virus)); Flaviviruses (examples of which include Dengue virus, West Nile virus, and Japanese encephalitis virus); and Coronaviruses (examples of which include SARS coronaviruses, such as the Urbani strain). Exemplary negative-strand RNA viruses include, but are not limited to: Orthomyxoviruses (such as the influenza virus), Rhabdoviruses (such as Rabies virus), and Paramyxoviruses (examples of which include measles virus, respiratory syncytial virus, and parainfluenza viruses). In one example the target is viral DNA from a DNA virus, such as Herpesviruses (such as Varicella-zoster virus, for example the Oka strain; cytomegalovirus; and Herpes simplex virus (HSV) types 1 and 2), Adenoviruses (such as Adenovirus type 1 and Adenovirus type 41), Poxviruses (such as Vaccinia virus), and Parvoviruses (such as Parvovirus B19). In another example, the target is a retroviral nucleic acid, such as one from human immunodeficiency virus type 1 (HIV-1), such as subtype C; HIV-2; equine infectious anemia virus; feline immunodeficiency virus (FIV); feline leukemia viruses (FeLV); simian immunodeficiency virus (SIV); and avian sarcoma virus. In one example, the target nucleic acid is a bacterial nucleic acid. In one example the bacterial nucleic acid is from a gram-negative bacterium, such as Escherichia coli (K-12 and O157:H7), Shigella dysenteriae, and Vibrio cholerae. In another example the bacterial nucleic acid is from a gram-positive bacterium, such as Bacillus anthracis, Staphylococcus aureus, pneumococcus, gonococcus, and streptococcal meningitis. In one example, the target nucleic acid is a nucleic acid from protozoa, nematodes, or fungi. Exemplary protozoa include, but are not limited to, Plasmodium, Leishmania, Acanthamoeba, Giardia, Entamoeba, Cryptosporidium, Isospora, Balantidium, Trichomonas, Trypanosoma, Naegleria, and Toxoplasma. Exemplary fungi include, but are not limited to, Coccidioides immitis and Blastomyces dermatitidis.

One of skill in the art can identify additional target DNAs or RNAs and/or additional target miRNAs which can be detected utilizing the methods disclosed herein.

VII. Assay Output

In some embodiments, the disclosed methods include determining presence or an amount of one or more target nucleic acid molecules in a sample. In other or additional embodiments, the disclosed methods include determining the sequence of one or more target nucleic acid molecules in a sample, which can include quantification of sequences detected. The results of the methods can be provided to a user (such as a scientist, clinician or other health care worker, laboratory personnel, or patient) in a perceivable output that provides information about the results of the test. In some examples, the output can be a paper output (for example, a written or printed output), a display on a screen, a graphical output (for example, a graph, chart, or other diagram), or an audible output. In one example, the output is a table or graph including a qualitative or quantitative indicator of presence or amount (such as a normalized amount) of a target nucleic acid molecule detected (or not detected) in the sample. In other examples the output is a map or image of signal present on a substrate (for example, a digital image of fluorescence from an array). In other examples, the output is the sequence of one or more target nucleic acid molecules in a sample, such as a report indicating the presence of a particular mutation in the target molecule.

In some examples, the output is a numerical value, such as an amount of a target nucleic acid molecule in a sample. In additional examples, the output is a graphical representation, for example, a graph that indicates the value (such as amount or relative amount) of a target nucleic acid molecule in the sample on a standard curve. In additional examples, the output is a graphical representation, for example, a graph that indicates the sequence of a target nucleic acid molecule in the sample (for example which might indicate where a mutation is present). In some examples, the output is communicated to the user, for example by providing an output via physical, audible, or electronic means (for example by mail, telephone, facsimile transmission, email, or communication to an electronic medical record).

The output can provide quantitative information (for example, an amount of a particular target nucleic acid molecule or an amount of a particular target nucleic acid molecule relative to a control sample or value) or can provide qualitative information (for example, a determination of presence or absence of a particular target nucleic acid molecule). In additional examples, the output can provide qualitative information regarding the relative amount of a target nucleic acid molecule in the sample, such as identifying an increase or decrease relative to a control or no change relative to a control.

As discussed herein, the NPPF amplicons can include one or more experiment tags, which can be used for example to identify a particular patient, sample, experiment, or target sequence. The use of such tags permits the detected or sequenced NPPF amplicon to be “sorted” or even counted, and thus permits analysis of multiple different samples (for example from different patients), multiple different targets (for example at least two different nucleic acid targets), or combinations thereof, in a single reaction. In one example, Illumina and Bowtie software can be used for such analysis.

In one example, the NPPFs include an experiment tag unique for each different target nucleic acid molecule. The use of such a tag allows one to merely sequence or detect this tag, without sequencing the entire NPPF, to identify the NPPF as corresponding to a particular nucleic acid target. In addition, if multiple nucleic acid targets are to be analyzed, the use of a unique experiment tag for each target simplifies the analysis, as each detected or sequenced experiment tag can be sorted, and if desired counted. This permits quantification of the target nucleic acid that was in the sample, as the NPPF amplicons are in stoichiometric proportion to the target in the sample. For example, if multiple target nucleic acids are detected or sequenced in a sample, the methods permit the generation of a table or graph showing each target sequence and the number of copies detected or sequenced, by simply detecting or sequencing and then sorting the experiment tag.
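As an illustration of sorting and counting target-specific experiment tags, the following Python sketch tallies reads per target. The tag sequences, target names, and the `count_targets` helper are hypothetical and chosen only for illustration; they are not taken from the disclosure.

```python
from collections import Counter

# Hypothetical lookup from experiment tag to target name (illustrative only).
TARGET_TAGS = {"ACGTAC": "target_A", "TTGCAA": "target_B"}


def count_targets(tag_reads, tag_to_target=TARGET_TAGS):
    """Count reads per target; tags absent from the lookup are ignored."""
    counts = Counter()
    for tag in tag_reads:
        target = tag_to_target.get(tag)
        if target is not None:
            counts[target] += 1
    return counts


# Hypothetical tag reads, including one unrecognized tag.
reads = ["ACGTAC", "TTGCAA", "ACGTAC", "NNNNNN", "ACGTAC"]
counts = count_targets(reads)
for target, n in sorted(counts.items()):
    print(f"{target}\t{n}")
```

Because the NPPF amplicons are described as being in stoichiometric proportion to the target, such per-tag counts could serve as a relative measure of each target in the sample.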

In another example, the NPPFs include an experiment tag unique for each different sample (such as a unique tag for each patient sample). The use of such a tag allows one to associate a particular detected NPPF amplicon with a particular sample. Thus, if multiple samples are analyzed in the same reaction (such as the same well or same sequencing reaction), the use of a unique experiment tag for each sample simplifies the analysis, as each detected or sequenced NPPF can be associated with a particular sample. For example, if a target nucleic acid is detected or sequenced in multiple samples, the methods permit the generation of a table or graph showing the result of the analysis for each sample.

One skilled in the art will appreciate that each NPPF amplicon can include a plurality of experiment tags (such as at least 2, 3, 4, 5, 6, 7, 8, 9 or 10 experiment tags), such as a tag representing the target sequence, and another representing the sample. Once each tag is detected or sequenced, appropriate software can be used to sort the data in any desired format, such as a graph or table. For example, this permits analysis of multiple target sequences in multiple samples simultaneously or contemporaneously.
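A minimal Python sketch of sorting amplicons that carry two experiment tags (one for the sample, one for the target) into a sample-by-target table follows. All tag sequences, sample names, and target names here are hypothetical, and the `tabulate` function is an illustrative helper rather than software named in the disclosure.

```python
from collections import defaultdict

# Hypothetical tag lookups (illustrative only).
SAMPLE_TAGS = {"AAAAAC": "sample_1", "CCCCCA": "sample_2"}
TARGET_TAGS = {"ACGTAC": "target_A", "TTGCAA": "target_B"}


def tabulate(reads, sample_tags=SAMPLE_TAGS, target_tags=TARGET_TAGS):
    """Build {sample: {target: count}} from (sample_tag, target_tag) pairs.

    Reads with an unrecognized sample tag or target tag are skipped.
    """
    table = defaultdict(lambda: defaultdict(int))
    for sample_tag, target_tag in reads:
        sample = sample_tags.get(sample_tag)
        target = target_tags.get(target_tag)
        if sample is None or target is None:
            continue
        table[sample][target] += 1
    return {sample: dict(row) for sample, row in table.items()}


# Hypothetical tag pairs read from four amplicons.
reads = [
    ("AAAAAC", "ACGTAC"),
    ("AAAAAC", "ACGTAC"),
    ("CCCCCA", "TTGCAA"),
    ("GGGGGG", "ACGTAC"),  # unknown sample tag, skipped
]
table = tabulate(reads)
for sample in sorted(table):
    print(sample, table[sample])
```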

In some examples, the detected or sequenced NPPF amplicon is compared to a reference database of known sequences for each target nucleic acid sequence. In some examples, such a comparison permits detection of mutations, such as SNPs. In some examples, such a comparison permits a comparison of a reference NPPF's abundance to the abundance of an NPPF probe in a region known to contain SNPs.
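The comparison of a sequenced amplicon to a known reference can be sketched, under simplifying assumptions, as a position-by-position check. The example below assumes the read and the reference are already aligned and of equal length, and both sequences are hypothetical; alignment software such as Bowtie handles the general case.

```python
def mismatches(read, reference):
    """Return (zero-based position, reference base, observed base) for each difference.

    Assumes the read is already aligned to the reference and has the same length.
    """
    if len(read) != len(reference):
        raise ValueError("read and reference must have the same length")
    return [
        (i, ref_base, obs_base)
        for i, (ref_base, obs_base) in enumerate(zip(reference, read))
        if ref_base != obs_base
    ]


# Hypothetical reference and read differing at a single position.
reference = "ACGTACGTAC"
read = "ACGTTCGTAC"
print(mismatches(read, reference))
```

An empty list corresponds to a perfect match; a single entry marks a candidate single nucleotide variation at that position.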

The disclosure is further illustrated by the following non-limiting Examples.

Example 1 Simultaneous Sequencing of a Plurality of NPPFs

This example describes methods used to generate and sequence NPPFs.

Seven different NPPFs were generated. Each NPPF included a region that was specific for a particular target nucleic acid molecule 25 nucleotides in length with a median Tm of 62° C., as well as flanking sequences on both ends. Although the 5′- and 3′-flanking sequences differed, they were the same for each of the seven different NPPFs. The 5′-flanking sequence was 25 nucleotides with a Tm of 61° C. and the 3′-flanking sequence was 25 nucleotides with a Tm of 63° C.

The seven different NPPFs were pooled at known ratios (1:1.5:2:4:5) and PCR amplified as follows. The NPPFs were incubated with PCR primers. One primer included a sequence that was complementary to the 5′-flanking sequence and the second primer included a sequence that was complementary to the 3′-flanking sequence. The second primer also included a sequence to allow for incorporation of a six nucleotide experiment tag into the resulting amplicon, so that each NPPF amplified using this primer had the same six nucleotide experiment tag. Several such reactions were carried out, each with a different tag. The first primer was 49 bases in length. Twenty of these bases were identical to the 5′-flanking sequence. These 20 bases had a Tm of 54° C. and the overall Tm of the entire primer was 70° C. The second primer complementary to the 3′-flanking sequence was 57 nucleotides total with a Tm of about 70° C. The first 19 nucleotides of the second primer were exactly complementary to the 3′-flanking region and had a Tm of 54° C.

Eight separate PCR reactions were run, so that variances could be identified. The resulting amplicons were cleaned up using either gel purification or standard column-based purification (Qiagen QIAQuick spin columns). The amplicons containing the NPPF and an experiment tag were then sequenced using the Illumina platform. Each amplicon sequenced was sorted based on the experiment tag sequence; each tag represented one replicate pool of the seven NPPFs. Within each experiment tag group, the number of amplicons identified for each of the seven NPPFs was counted.

128 million amplicons were sequenced, and of those, 110 million (87%) resulted in a perfectly sequenced experiment tag. The amplicons were compared to the expected sequences using Bowtie, resulting in about 80% perfect-match sequences. This is a good percentage of perfect-match sequences for the Illumina system, based on its published error and quality specifications. FIG. 5 shows the number of amplicons detected for each of the seven unique NPPFs corresponding to the original ratio of NPPF pooled prior to PCR. The probes were measured in eight separate experiments, each of which had a different experiment tag added during amplification. These were all pooled into a single channel of the sequencer and sequenced. The error bars indicate the reproducibility (1 SD) of the eight experiments.

FIG. 6, a replot of the data shown in FIG. 5, shows the eight individual experimental results for each probe, the average (without error bars, same average as depicted in FIG. 5 with error bars), and the expected (based on the amount of NPPF added to the sample). The ratios observed for each of the seven NPPFs matched those expected (based on the original amount of NPPF added to the PCR reaction).
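The comparison of observed and expected ratios across replicate experiments can be sketched as follows. The counts and pooling ratio below are hypothetical placeholders (they are not the data of FIG. 5 or FIG. 6), and the `fractions` and `summarize` helpers are illustrative names. The sketch converts each replicate's counts to fractions of its total, then reports the mean and sample standard deviation per probe next to the fraction expected from the pooling ratio.

```python
from statistics import mean, stdev


def fractions(counts):
    """Convert raw counts (or ratio parts) into fractions of the total."""
    total = sum(counts.values())
    return {probe: n / total for probe, n in counts.items()}


def summarize(replicates, pooled_ratio):
    """Per probe: mean and SD of observed fractions, plus the expected fraction."""
    expected = fractions(pooled_ratio)
    per_replicate = [fractions(rep) for rep in replicates]
    summary = {}
    for probe in pooled_ratio:
        observed = [rep.get(probe, 0.0) for rep in per_replicate]
        summary[probe] = {
            "mean": mean(observed),
            "sd": stdev(observed),  # sample SD; needs at least two replicates
            "expected": expected[probe],
        }
    return summary


# Hypothetical pooling ratio and amplicon counts for two replicate tags.
pooled_ratio = {"probe_1": 1, "probe_2": 2, "probe_3": 4}
replicates = [
    {"probe_1": 100, "probe_2": 200, "probe_3": 400},
    {"probe_1": 110, "probe_2": 190, "probe_3": 400},
]
summary = summarize(replicates, pooled_ratio)
for probe, row in summary.items():
    print(probe, {key: round(value, 4) for key, value in row.items()})
```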

Example 2 Simultaneous Detection of a Plurality of NPPFs

This example describes methods used to generate and detect NPPFs using an array, and quantification of the degree of amplification achieved.

Three different NPPFs were generated (one containing a sequence complementary to the human BAX gene, one containing an EML4-ALK fusion gene complementary sequence, and one containing an EML4 complementary sequence). Thus, each NPPF included a region that was specific for a particular target nucleic acid molecule, as well as flanking sequences on both ends. The 5′-end of each NPPF had a biotin label. The NPPFs were 25 nucleotides, having a Tm of 63° C. to 65° C. Although the 5′- and 3′-flanking sequences differed, they were the same for each of the three different NPPFs. The 5′-flanking sequence was 25 nucleotides with a Tm of 61° C. and the 3′-flanking sequence was 25 nucleotides with a Tm of 63° C.

In one experiment, the unamplified NPPFs were hybridized to an array following qNPA. The array included an anchor probe bound to bifunctional linkers. One half of the bifunctional linker is complementary to the anchor, and the other half is complementary to the gene-specific part of the NPPF. The linker thus forms a bridge between the anchor and the NPPF. The three different NPPFs were pooled at known ratios, and hybridized to synthetic RNAs containing the target sequences of interest, as well as to CFSs complementary to the flanking regions on the NPPFs. Following S1-mediated digestion of unhybridized RNA, NPPFs, and CFSs, the reaction was split. One fraction was incubated with the array under conditions to permit the NPPFs to bind to their appropriate bifunctional linker. Binding of the NPPF to the array was detected by the biotin label present on the NPPF using fluorescent streptavidin-phycoerythrin.

In another experiment, another fraction of the pooled reaction was PCR amplified prior to hybridization to an array, and the product was diluted 1:10 or 1:100 before hybridization to the array. For PCR amplification, the reaction containing NPPFs was incubated with PCR primers. One primer included a sequence that was identical to the 5′-flanking sequence (and included a biotin label) and the second primer included a sequence that was complementary to the 3′-flanking sequence. The first primer complementary to the 5′-flanking sequence was 22 nucleotides and had a Tm of 59° C., and the second primer complementary to the 3′-flanking sequence was 22 nucleotides and had a Tm of 56° C. The advantage of using the NPPFs which have the same flanking sequences (but different target-specific regions) is that the flanking sequences permit the use of universal PCR primers, such that only a single 5′-primer sequence and a single 3′-primer sequence are needed to amplify a plurality of different NPPF sequences. The NPPF amplicons were diluted 1:10 or 1:100, then hybridized to the array and detected as described above.

As shown in FIG. 7, the use of PCR amplification, prior to hybridization capture, increases the sensitivity by at least 150-fold (taking into account the dilution of the amplicons following the PCR step).
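One way to account for the dilution when comparing amplified and unamplified signals is sketched below. This is an illustrative calculation only: it assumes the array signal scales linearly with the amount of NPPF hybridized, and the signal values are hypothetical arbitrary units, not data from FIG. 7.

```python
def fold_sensitivity(signal_amplified, dilution_factor, signal_unamplified):
    """Fold increase in signal after correcting the amplified signal for dilution.

    dilution_factor is 10 for a 1:10 dilution and 100 for a 1:100 dilution.
    Assumes signal is proportional to the amount of NPPF hybridized.
    """
    if signal_unamplified <= 0:
        raise ValueError("unamplified signal must be positive")
    return signal_amplified * dilution_factor / signal_unamplified


# Hypothetical signals in arbitrary units (not measurements from the disclosure).
print(fold_sensitivity(signal_amplified=3000.0, dilution_factor=100,
                       signal_unamplified=1500.0))
```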

Example 3 Simultaneous Sequencing of a Plurality of NPPFs Designed to Measure mRNAs or miRNAs

This example describes methods used to generate and sequence NPPFs.

Two sets of NPPFs were generated. In the first set, forty-six different NPPFs were generated. Each NPPF included a region that was specific for a particular target nucleic acid molecule 25 nucleotides in length with a median Tm of 56° C., as well as flanking sequences on both ends. For the second set, thirteen different NPPFs were generated. Each NPPF included a region that was specific for a particular miRNA target nucleic acid molecule 18-25 nucleotides in length with a median Tm of 51° C., as well as flanking sequences on both ends.

For all NPPFs, regardless of target, although the 5′- and 3′-flanking sequences differed, they were the same for each of the different NPPFs. The 5′-flanking sequence (5′-AGTTCAGACGTGTGCTCTTCCGATC-3′; SEQ ID NO: 17) was 25 nucleotides with a Tm of 61° C. and the 3′-flanking sequence (5′-GATCGTCGGACTGTAGAACTCTGAA-3′; SEQ ID NO: 18) was 25 nucleotides with a Tm of 63° C.
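Because a CFS is complementary to a flanking region, a candidate full-length CFS for each flank can be sketched as the reverse complement of SEQ ID NO: 17 or SEQ ID NO: 18. The sketch below is illustrative only: the actual CFS sequences, their lengths, and whether they are DNA or RNA are not stated here, so treating each CFS as a full-length DNA reverse complement is an assumption.

```python
# Map each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

FLANK_5 = "AGTTCAGACGTGTGCTCTTCCGATC"  # 5'-flanking sequence, SEQ ID NO: 17
FLANK_3 = "GATCGTCGGACTGTAGAACTCTGAA"  # 3'-flanking sequence, SEQ ID NO: 18


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Assumed full-length complements of each flank, written 5' to 3'.
cfs_5 = reverse_complement(FLANK_5)
cfs_3 = reverse_complement(FLANK_3)
print(len(FLANK_5), cfs_5)
print(len(FLANK_3), cfs_3)
```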

qNPA was performed on lysates from two cell lines at different concentrations, using these NPPFs as probes. FIG. 10 shows the qNPA reactions, the samples used as input material, and the experiment tags added prior to sequencing. Reactions were performed in triplicate for each cell concentration. Some experiment tags were not recognized by the sequencer software and thus the reactions tagged with those experiment tags were not considered in this analysis. The different NPPFs were pooled, and hybridized to the RNA of a cell lysate, as well as to CFSs complementary to the flanking regions on the NPPFs. Hybridization was performed at 50° C. for the forty-six NPPFs from set 1, but performed at 37° C. for the thirteen NPPFs from set 2. The difference in temperature takes into account the shorter length of the miRNA NPPFs and their corresponding lower Tms.

Following S1-mediated digestion of unhybridized RNA, NPPFs, and CFSs, the reaction was neutralized by addition of 1M Tris pH 9.0 and the S1 nuclease was inactivated by heating to 95° C. for 20 minutes. Each resulting reaction, which contained NPPFs as representatives of the original transcripts in the sample, was then incubated with PCR primers. One primer included a sequence that was complementary to the 5′-flanking sequence and the second primer included a sequence that was complementary to the 3′-flanking sequence. The second primer also included a sequence to allow for incorporation of a six nucleotide experiment tag into the resulting amplicon, so that each NPPF amplified using this primer had the same six nucleotide experiment tag.

The first primer (5′-AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACGATC-3′; SEQ ID NO: 19) was 49 bases in length. Twenty of these bases were identical to the 5′-flanking sequence. These 20 bases had a Tm of 54° C. and the overall Tm of the entire primer was 70° C. The second primer (5′-CAAGCAGAAGACGGCATACGAGATnnnnnnGTGACTGGAGTTCAGACGTGTGCTCTT-3′; SEQ ID NO: 20), complementary to the 3′-flanking sequence, was 57 nucleotides total with a Tm of about 70° C. The first 19 nucleotides of the second primer were exactly complementary to the 3′-flanking region and had a Tm of 54° C. The six bases marked “nnnnnn” above were one of the 24 barcode sequences listed in Table 2. The resulting primer sequence is shown in the right column, with its SEQ ID NO: in parentheses.

TABLE 2
Sequence of Primers and Barcodes

Barcode sequence
(nnnnnn in
SEQ ID NO: 20)    Resulting Primer Sequence with Barcode (SEQ ID NO:)

ATCACG            CAAGCAGAAGACGGCATACGAGATTCACGGTGACTGGAGTTCAGACGTGTGCTCTT (21)
CGATGT            CAAGCAGAAGACGGCATACGAGATCGATGTGTGACTGGAGTTCAGACGTGTGCTCTT (22)
TTAGGC            CAAGCAGAAGACGGCATACGAGATTTAGGCGTGACTGGAGTTCAGACGTGTGCTCTT (23)
TGACCA            CAAGCAGAAGACGGCATACGAGATTGACCAGTGACTGGAGTTCAGACGTGTGCTCTT (24)
ACAGTG            CAAGCAGAAGACGGCATACGAGATACAGTGGTGACTGGAGTTCAGACGTGTGCTCTT (25)
GCCAAT            CAAGCAGAAGACGGCATACGAGATGCCAATGTGACTGGAGTTCAGACGTGTGCTCTT (26)
CAGATC            CAAGCAGAAGACGGCATACGAGATCAGATCGTGACTGGAGTTCAGACGTGTGCTCTT (27)
ACTTGA            CAAGCAGAAGACGGCATACGAGATACTTGAGTGACTGGAGTTCAGACGTGTGCTCTT (28)
GATCAG            CAAGCAGAAGACGGCATACGAGATGATCAGGTGACTGGAGTTCAGACGTGTGCTCTT (29)
TAGCTT            CAAGCAGAAGACGGCATACGAGATTAGCTTGTGACTGGAGTTCAGACGTGTGCTCTT (30)
GGCTAC            CAAGCAGAAGACGGCATACGAGATGGCTACGTGACTGGAGTTCAGACGTGTGCTCTT (31)
CTTGTA            CAAGCAGAAGACGGCATACGAGATCTTGTAGTGACTGGAGTTCAGACGTGTGCTCTT (32)
AGTCAA            CAAGCAGAAGACGGCATACGAGATAGTCAAGTGACTGGAGTTCAGACGTGTGCTCTT (33)
AGTTCC            CAAGCAGAAGACGGCATACGAGATAGTTCCGTGACTGGAGTTCAGACGTGTGCTCTT (34)
ATGTCA            CAAGCAGAAGACGGCATACGAGATATGTCAGTGACTGGAGTTCAGACGTGTGCTCTT (35)
CCGTCC            CAAGCAGAAGACGGCATACGAGATCCGTCCGTGACTGGAGTTCAGACGTGTGCTCTT (36)
GTAGAG            CAAGCAGAAGACGGCATACGAGATGTAGAGGTGACTGGAGTTCAGACGTGTGCTCTT (37)
GTCCGC            CAAGCAGAAGACGGCATACGAGATGTCCGCGTGACTGGAGTTCAGACGTGTGCTCTT (38)
GTGAAA            CAAGCAGAAGACGGCATACGAGATGTGAAAGTGACTGGAGTTCAGACGTGTGCTCTT (39)
GTGGCC            CAAGCAGAAGACGGCATACGAGATGTGGCCGTGACTGGAGTTCAGACGTGTGCTCTT (40)
GTTTCG            CAAGCAGAAGACGGCATACGAGATGTTTCGGTGACTGGAGTTCAGACGTGTGCTCTT (41)
CGTACG            CAAGCAGAAGACGGCATACGAGATCGTACGGTGACTGGAGTTCAGACGTGTGCTCTT (42)
GAGTGG            CAAGCAGAAGACGGCATACGAGATGAGTGGGTGACTGGAGTTCAGACGTGTGCTCTT (43)
GGTAGC            CAAGCAGAAGACGGCATACGAGATGGTAGCGTGACTGGAGTTCAGACGTGTGCTCTT (44)
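The substitution of a six nucleotide barcode for the “nnnnnn” placeholder of SEQ ID NO: 20 can be illustrated with a short sketch. The function name `build_tagged_primer` and its validation rules are hypothetical and are offered only as one possible way to generate the primer sequences; they are not part of the disclosed method.

```python
# Template taken from SEQ ID NO: 20; "nnnnnn" marks the experiment tag position.
TEMPLATE = "CAAGCAGAAGACGGCATACGAGATnnnnnnGTGACTGGAGTTCAGACGTGTGCTCTT"
PLACEHOLDER = "nnnnnn"


def build_tagged_primer(barcode: str, template: str = TEMPLATE) -> str:
    """Return the template with its placeholder replaced by the barcode.

    Hypothetical helper: requires exactly six A/C/G/T bases and a template
    that contains the placeholder exactly once.
    """
    barcode = barcode.upper()
    if len(barcode) != len(PLACEHOLDER) or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be six A/C/G/T bases")
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain the placeholder exactly once")
    return template.replace(PLACEHOLDER, barcode)


primer = build_tagged_primer("CGATGT")
print(primer)       # primer carrying the CGATGT barcode
print(len(primer))  # total primer length in nucleotides
```

For the barcode CGATGT, the sketch yields a 57 nucleotide primer, in agreement with the length stated above for the second primer.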

Each triplicate reaction was amplified in a separate PCR reaction, and had a separate experimental tag, so that variance could be identified (see FIG. 10). The resulting amplicons were cleaned up using either gel purification or standard column-based purification (Qiagen QIAQuick spin columns). The amplicons containing the NPPF and an experimental tag were then sequenced using an Illumina platform. While the experimental tag can be located in several places, in this example, it was located at the 3′-end of the amplicon, immediately downstream of a region complementary to an index-read sequencing primer. Illumina sequencing was thus done in two steps, an initial read of the sequence followed by a second read of the experimental tag using a second sequencing primer. Using two sequencing primers in this manner is one standard method for multiplexing samples on the Illumina platform.

Each amplicon sequenced was sorted based first on the experiment tag (barcode), and then within each experiment tag group, the number of amplicons identified for each of the different tags was counted. The amplicons were compared to the expected sequences using Bowtie.
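The sorting and counting step can be sketched as follows. This is a minimal illustration only: exact string matching is used here as a simple stand-in for the Bowtie alignment named above, and the function name `count_amplicons`, the probe names, and the short toy sequences are all hypothetical.

```python
from collections import Counter, defaultdict


def count_amplicons(reads, expected):
    """Count reads per experiment tag and per expected probe sequence.

    reads    -- iterable of (experiment_tag, insert_sequence) pairs
    expected -- dict mapping a probe name to its expected sequence
    Exact matching is an assumption made for brevity; it replaces alignment.
    """
    name_by_sequence = {seq: name for name, seq in expected.items()}
    counts = defaultdict(Counter)
    for tag, seq in reads:
        name = name_by_sequence.get(seq)
        if name is None:
            continue  # read does not match any expected sequence
        counts[tag][name] += 1
    return counts


# Toy inputs; probe names and 8-nt sequences are invented for illustration.
expected = {"probe_A": "ACGTACGT", "probe_B": "TTGGCCAA"}
reads = [
    ("CGATGT", "ACGTACGT"),
    ("CGATGT", "ACGTACGT"),
    ("CGATGT", "TTGGCCAA"),
    ("TTAGGC", "TTGGCCAA"),
    ("TTAGGC", "GGGGGGGG"),  # unmatched read, ignored
]
counts = count_amplicons(reads, expected)
for tag in sorted(counts):
    print(tag, dict(counts[tag]))
```

Grouping first by experiment tag keeps each replicate reaction separate, so that per-probe counts can later be compared across replicates.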

FIG. 11 shows the results from triplicate qNPA reactions on THP1 cells, using the 46 mRNA NPPFs. Excellent reproducibility was observed between replicates and CVs are low. The graph represents the number of amplicons detected for each of the forty-six unique NPPFs corresponding to the original ratio of NPPF pooled prior to PCR. Error bars represent 1 standard deviation from the mean. The probes were measured in three separate experiments, each of which had a different experimental tag added during amplification. These were all pooled into a single channel of the sequencer and sequenced. The error bars indicate the reproducibility (1 SD) of the three experiments.
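The replicate statistics referred to above (mean, one standard deviation, and CV) can be computed as in the following sketch. The use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which estimator was used, and the counts shown are placeholder values rather than data from FIG. 11.

```python
import statistics


def replicate_summary(counts):
    """Return (mean, sample SD, CV in percent) for replicate counts.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)
    cv_percent = sd / mean * 100 if mean else float("nan")
    return mean, sd, cv_percent


# Placeholder triplicate counts for a single probe (not measured data).
mean, sd, cv_percent = replicate_summary([90, 100, 110])
print(mean, sd, cv_percent)
```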

FIGS. 12A and 12B show the plot counts obtained for 12 of the 46 mRNA NPPFs from reactions run on a four-point THP1 cell titration. The data shown represent the lowest (A) and highest (B) abundance NPPFs, and demonstrate the large range of detection obtainable using sequencing. They also demonstrate the linearity of the qNPS reaction for both high and low abundance probes (representing high and low expression of the corresponding RNA in the sample).

FIG. 13 plots the results for five of the thirteen miRNA NPPFs from reactions run on a three-point HepG2 cell titration (5000 cells-50000 cells). These five were chosen because they had similar levels and could be clearly seen on the same plot. The plot demonstrates that miRNAs are detectable in cell lysates using the disclosed methods, and show good linearity over the sample sizes tested.
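One common way to quantify linearity over a titration is the squared Pearson correlation between input cell number and read count. The sketch below is illustrative only: the text does not state how linearity was assessed, the intermediate titration point is assumed, and the counts are placeholder values, not data from FIG. 13.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


# Placeholder titration: the 25000-cell point and all counts are invented.
cells = [5000, 25000, 50000]
probe_counts = [100, 500, 1000]
r2 = r_squared(cells, probe_counts)
print(r2)
```

A value close to 1 indicates that counts scale proportionally with the amount of input material over the range tested.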

Example 4 Detection of a Plurality of NPPFs Designed to Measure mRNA Using Sequencing and Capture of the NPPFs on an Array

This example describes methods used to generate and sequence NPPFs.

Nine different NPPFs were generated. Each NPPF included a region that was specific for a particular target nucleic acid molecule 25 nucleotides in length with a median Tm of 57° C., as well as flanking sequences on both ends. The 5′- and 3′-flanking sequences differed from each other but were the same for each of the different NPPFs. The 5′-flanking sequence (5′-AGTTCAGACGTGTGCTCTTCCGATC-3′; SEQ ID NO: 17) was 25 nucleotides with a Tm of 61° C. and the 3′-flanking sequence (5′-GATCGTCGGACTGTAGAACTCTGAA-3′; SEQ ID NO: 18) was 25 nucleotides with a Tm of 63° C. A biotin label was included on the 5′ flanking sequence of each NPPF.

qNPA was performed on samples comprised of dilutions of synthetic RNAs (in vitro transcribed RNAs) in qNPA lysis buffer. Reactions were performed in triplicate for each sample concentration. The different NPPFs were pooled at 166 pM each, and hybridized to the samples described above, as well as to CFSs complementary to the flanking regions on the NPPFs. CFSs were included in the reaction at a 10-fold molar ratio (1.6 nM each CFS). Hybridization was performed at 50° C. for at least 16 hours in a total reaction volume of 30 μl. Following hybridization, 20 μl of S1 reaction buffer was added to the reaction. This buffer is comprised of: 100 mM NaOAc pH 5.0, 250 mM KCl, 22.5 nM ZnSO4, and 25 U of S1 nuclease. The S1 reaction was allowed to proceed for 90 minutes at 50° C. Following S1-mediated digestion of unhybridized RNA, NPPFs, and CFSs, the reaction was neutralized by addition of 1.5 μl of 1M Tris pH 9.0 and the S1 nuclease was inactivated by heating to 95° C. for 20 minutes. Each resulting reaction contained NPPFs as representatives of the original transcripts in the sample. At this point, the reaction was split into two parts.
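The relationship between the NPPF concentration and the 10-fold CFS concentration, and the reaction volume after addition of the S1 reaction buffer, can be checked with simple arithmetic. The helper name below is hypothetical; the sketch only restates the quantities given above.

```python
def cfs_concentration_nM(nppf_pM: float, fold: float) -> float:
    """CFS concentration in nM for a given NPPF concentration (pM) and fold ratio."""
    return nppf_pM * fold / 1000.0  # 1000 pM per nM


cfs_nM = cfs_concentration_nM(166, 10)
print(cfs_nM)  # about 1.66 nM, which the text above quotes as 1.6 nM

# Volume after adding 20 ul of S1 reaction buffer to the 30 ul hybridization.
digestion_volume_ul = 30 + 20
print(digestion_volume_ul)
```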

One part of the unamplified NPPFs was hybridized to an array following qNPA. The array included an anchor probe bound to bifunctional linkers. One half of the bifunctional linker is complementary to the anchor, and the other half is complementary to the gene-specific part of the NPPF. The linker thus forms a bridge between the anchor and the NPPF. The NPPFs from the above reaction were supplemented with a salt replacement buffer to adjust the reaction to conditions used for array hybridization (salt replacement buffer is: 3.225 M NaCl; 67.5 mM EDTA pH 8.0; 3×SSC; 500 mM HEPES pH 7.5) and were incubated with the array for 16 hours at 50° C. Binding of the NPPF to the array was detected by the biotin label present on the NPPF using fluorescent streptavidin-phycoerythrin.

The other part of the reaction was prepared for sequencing. The reaction was first incubated with PCR primers. One primer included a sequence that was complementary to the 5′-flanking sequence and the second primer included a sequence that was complementary to the 3′-flanking sequence. The second primer also included a sequence to allow for incorporation of a six nucleotide experiment tag into the resulting amplicon, so that each NPPF amplified using this primer had the same six nucleotide experiment tag.

The first primer was 49 bases in length. Twenty of these bases were identical to the 5′-flanking sequence. These 20 bases had a Tm of 54° C. and the overall Tm of the entire primer was 70° C. The second primer, complementary to the 3′-flanking sequence, was 57 nucleotides total with a Tm of about 70° C. The first 19 nucleotides of the second primer were exactly complementary to the 3′-flanking region and had a Tm of 54° C.

Each triplicate reaction was amplified in a separate PCR reaction, with a separate tag, so that variance could be identified. The resulting amplicons were cleaned up using either gel purification or standard column-based purification (Qiagen QIAQuick spin columns). The amplicons containing the NPPF and an experimental tag were then sequenced using an Illumina platform, using the second index read technique to sequence the experiment tag, as described in Example 3.

The PCR reactions were also set up to determine the impact of cycle number on the sequencing results. Briefly, each triplicate reaction was amplified in three separate PCR reactions, each with a separate tag, so that variance could be identified. These three PCR reactions underwent 10, 12, or 15 cycles of PCR. The resulting amplicons were cleaned up using either gel purification or standard column-based purification (Qiagen QIAQuick spin columns). The amplicons containing the NPPF and an experimental tag were then sequenced using an Illumina platform, using the second index read technique to sequence the experiment tag, as described in Example 3.

Each amplicon sequenced was sorted based first on the experiment tag (barcode), and then within each experiment tag group, the number of amplicons identified for each of the different tags was counted. The amplicons were compared to the expected sequences using Bowtie.

FIG. 14 shows that low PCR cycle numbers (10, 12, and 15) do not unduly influence sequencing results. The bar graph shows the counts generated for each NPPF following sequencing. The number of cycles and the amount of input material in the original sample are indicated. The data were normalized to allow for comparison of the different cycles and input levels. While it is clear that any of these cycle numbers could be used with the disclosed methods, the increase in material in the samples following 15 PCR cycles made subsequent clean up of the sequencing library easier. Greater than 15 cycles produces spurious products larger and smaller than the desired size of amplicon. Thus, in some examples, the disclosed methods use 10 to 15 PCR cycles, such as 10, 11, 12, 13, 14, or 15 cycles.
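The normalization method is not specified above. One simple possibility, shown here purely as an assumption, is to express each NPPF count as a fraction of the total counts in its library, which makes libraries of different depth (for example, after different PCR cycle numbers) directly comparable. The function name and the counts are hypothetical placeholders.

```python
def normalize_to_total(counts: dict) -> dict:
    """Scale per-probe counts so that they sum to 1 within one library.

    Assumed normalization scheme; the source does not state the one used.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no counts")
    return {probe: value / total for probe, value in counts.items()}


# Placeholder libraries: same relative abundance, different depth.
cycles_10 = normalize_to_total({"probe_A": 10, "probe_B": 30})
cycles_15 = normalize_to_total({"probe_A": 320, "probe_B": 960})
print(cycles_10)
print(cycles_15)
```

Under this scheme, two libraries with the same relative probe abundances give identical normalized profiles regardless of how many cycles were used.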

FIGS. 15A and 15B show the results from the same triplicate qNPA reactions after splitting. NPPFs were detected by hybridization to an array (FIG. 15A) or by counting sequenced NPPFs (FIG. 15B). The bars shown are averages of the triplicates, and error bars represent one standard deviation from the mean.

In view of the many possible embodiments to which the principles of the disclosed invention may be applied, it should be recognized that the illustrated embodiments are only examples of the disclosure and should not be taken as limiting the scope of the disclosure. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.

We claim:
 1. A method of detecting at least one target nucleic acid molecule in a sample, comprising: contacting the sample with at least one nuclease protection probe comprising a flanking sequence (NPPF) under conditions sufficient for the NPPF to specifically bind to the target nucleic acid molecule, wherein the NPPF comprises: a 5′-end and a 3′-end, a sequence complementary to a region of the target nucleic acid molecule, permitting specific binding between the NPPF and the target nucleic acid molecule, a flanking sequence located 5′, 3′, or both, to the sequence complementary to the target nucleic acid molecule, wherein the flanking sequence comprises at least 12 contiguous nucleotides not found in a nucleic acid molecule present in the sample providing a universal amplification sequence, and wherein the flanking sequence is complementary to at least a portion of an amplification primer; contacting the sample with a nucleic acid molecule comprising a sequence complementary to the flanking sequence (CFS) under conditions sufficient for the flanking sequence to specifically bind to the CFS; contacting the sample with a nuclease specific for single-stranded nucleic acid molecules under conditions sufficient to remove unbound nucleic acid molecules, thereby generating a digested sample comprising NPPFs hybridized to the target nucleic acid molecule and to the CFS(s); amplifying NPPFs in the digested sample with amplification primer, thereby generating NPPF amplicons; and detecting the NPPF amplicons, thereby detecting the at least one target nucleic acid molecule in the sample.
 2. The method of claim 1, wherein the NPPF comprises a DNA molecule.
 3. The method of claim 1, wherein: the NPPF comprises 35-150 nucleotides; the sequence complementary to a region of the target nucleic acid molecule is 10-60 nucleotides in length; the flanking sequence is 12 to 50 nucleotides in length; or combinations thereof.
 4. The method of claim 1, wherein the NPPF comprises a flanking sequence at the 5′-end and the 3′-end, wherein the flanking sequence at the 5′-end differs from the flanking sequence at the 3′-end.
 5. The method of claim 1, wherein the at least one amplification primer further comprises a sequence that permits attachment of an experimental tag or sequencing adapter to the NPPF amplicon during the amplification step.
 6. The method of claim 1, wherein the flanking sequence further comprises an experimental tag, sequencing adapter, or both.
 7. The method of claim 6, wherein: the experimental tag comprises a nucleic acid sequence that permits identification of a sample, subject, treatment or target nucleic acid sequence; the sequencing adapter comprises a nucleic acid sequence that permits capture onto a sequencing platform; the experimental tag or sequence tag is present on the 5′-end or 3′-end of the NPPF amplicon; or combinations thereof.
 8. The method of claim 1, wherein one or more target nucleic acid molecules are fixed, cross-linked, or insoluble.
 9. The method of claim 1, wherein the NPPF is a DNA and the nuclease comprises an exonuclease, an endonuclease, or a combination thereof.
 10. The method of claim 1, wherein the nuclease specific for single-stranded nucleic acid molecules comprises S1 nuclease.
 11. The method of claim 1, wherein the method detects at least one target nucleic acid molecule in a plurality of samples simultaneously.
 12. The method of claim 1, wherein the method detects at least two target nucleic acid molecules, and wherein the sample is contacted with at least two different NPPFs, each NPPF specific for a different target nucleic acid molecule.
 13. The method of claim 1, wherein the method is performed on a plurality of samples and at least two target nucleic acid molecules are detected in each of the plurality of samples.
 14. The method of claim 1, wherein at least one NPPF is specific for a miRNA target nucleic acid molecule and at least one NPPF is specific for an mRNA target nucleic acid molecule.
 15. The method of claim 1, further comprising lysing the sample.
 16. The method of claim 1, wherein detecting the NPPF amplicons comprises contacting the NPPF amplicons with a surface comprising multiple spatially discrete regions, each region comprising: at least one anchor in association with a bifunctional linker, wherein the bifunctional linker comprises a first portion which specifically binds to the anchor and a second portion which specifically binds to at least a portion of one of the NPPF amplicons, under conditions sufficient for the NPPF amplicons to specifically bind to the second portion of the bifunctional linker; or at least one nucleic acid anchor having a region complementary to at least a portion of one of the NPPF amplicons, under conditions sufficient for the NPPF amplicons to specifically bind to the nucleic acid anchor.
 17. The method of claim 1, wherein detecting the NPPF amplicons comprises contacting the NPPF amplicons with a population of surfaces, wherein the population of surfaces comprises subpopulations of surfaces, and wherein: each subpopulation of surfaces comprises at least one anchor in association with a bifunctional linker comprising a first portion which specifically binds to at least a portion of one of the NPPF amplicons, under conditions sufficient for the NPPF amplicons to specifically bind to the second portion of the bifunctional linker; or each subpopulation of surfaces comprises at least one nucleic acid anchor having a region complementary to at least a portion of one of the NPPF amplicons, under conditions sufficient for the NPPF amplicons to specifically bind to the nucleic acid anchor.
 18. The method of claim 17, wherein the population of surfaces comprises a population of beads.
 19. The method of claim 1, wherein the second portion of the bifunctional linker is complementary to the NPPF region complementary to the region of the target nucleic acid molecule, thereby permitting specific binding between the NPPF amplicon and the bifunctional linker.
 20. The method of claim 1, wherein the NPPF amplicon comprises a detectable label.
 21. The method of claim 1, wherein the at least one NPPF comprises at least 10 NPPFs.
 22. The method of claim 1, wherein the sample is formalin fixed.